Abstract —The deterministic notions of capacity and entropy
are studied in the context of communication and storage of
information using square-integrable, bandlimited signals subject
to perturbation. The $(\epsilon, \delta)$-capacity, which extends the
Kolmogorov $\epsilon$-capacity to packing sets of overlap at most
$\delta$, is introduced and compared to the Shannon capacity. The
functional form of the results indicates that in both Kolmogorov’s
and Shannon’s settings, capacity and entropy grow linearly with the
number of degrees of freedom, but only logarithmically with the
signal to noise ratio. This basic insight transcends the details of
the stochastic or deterministic description of the
information-theoretic model. For $\delta = 0$, the analysis leads to
new bounds on the Kolmogorov $\epsilon$-capacity, and to a tight
asymptotic expression of the Kolmogorov $\epsilon$-entropy of
bandlimited signals. A deterministic notion of error exponent is
introduced. Applications of the theory are briefly discussed.

Index Terms —Bandlimited signals, capacity, entropy,
$\epsilon$-capacity, $\epsilon$-entropy, zero-error capacity, $N$-width, degrees of
freedom, approximation theory, rate-distortion function.

I. Introduction 

Claude Shannon introduced the notions of capacity and
entropy in the context of communication in 1948 [1],
and with them he ignited a technological revolution. His
work instantly became a classic and it is today the pillar
of modern digital technologies. On the other side of the
globe, the great Soviet mathematician Andrei Kolmogorov
became acquainted with Shannon’s work in the early 1950s and
immediately recognized that “his mathematical intuition is
remarkably precise.” Kolmogorov’s notions of $\epsilon$-entropy and
$\epsilon$-capacity [2], [3] were certainly influenced by Shannon’s
work. The $\epsilon$-capacity has the same operational interpretation
as Shannon’s capacity, in terms of the limit for the amount of
information that can be transmitted under perturbation, but it was
developed in the purely deterministic setting of functional
approximation. On the other hand, the $\epsilon$-entropy corresponds
to the amount of information required to represent any function of a
given class within $\epsilon$ accuracy, while the Shannon entropy
corresponds to the average amount of information required to
represent any stochastic process of a given class, quantized at
level $\epsilon$. Kolmogorov’s interest in approximation theory dated
back to at least the nineteen-thirties, when he introduced the
concept of $N$-width to characterize the “massiveness” or effective
dimensionality of an infinite-dimensional functional space [4].
This interest also eventually led him to the solution in the late
nineteen-fifties, together with his student Arnold, of Hilbert’s
thirteenth problem [5].
Even though they shared the goal of mathematically
describing the limits of communication and storage of
information, Shannon’s and Kolmogorov’s approaches to information
theory have evolved separately. Shannon’s theory flourished
in the context of communication, while Kolmogorov’s work
impacted mostly mathematical analysis. Connections between
their definitions of entropy have been pointed out in [6],
and we discussed the relationship between capacities in our
previous work [7]. The related concept of complexity and its
relation to algorithmic information theory has been treated
extensively [8], [9]. Kolmogorov devoted his presentation at the
1956 International Symposium on Information Theory [10],
and Appendix II of his work with Tikhomirov [3], to exploring
the relationship with the probabilistic theory of information
developed in the West, but limited the discussion “at the level
of analogy and parallelism.” This is not surprising, given the
state of affairs of the mathematics of functional approximation
in the nineteen-fifties: at the time, the theory of spectral
decomposition of time-frequency limiting operators, needed
for a rigorous treatment of continuous waveform channels, had
yet to be developed by Landau, Pollak and Slepian [11], [12].

Renewed interest in deterministic models of information
has recently been raised in the context of networked control
theory [13], [14], and in the context of electromagnetic wave
theory [15], [16], [17]. Motivated by these applications, in this
paper we define the number of degrees of freedom, or effective
dimensionality, of the space of bandlimited functions in terms
of $N$-width, and study capacity and entropy in Kolmogorov’s
deterministic setting. We also extend Kolmogorov’s capacity to
packing sets of non-zero overlap, which allows a more detailed
comparison with Shannon’s work.

A. Capacity and packing 

Shannon’s capacity is closely related to the geometric problem
of packing “billiard balls” in high-dimensional space.
Roughly speaking, each transmitted signal, represented by the
coefficients of an orthonormal basis expansion, corresponds
to a point in the space, and balls centered at the transmitted
points represent the probability density of the uncertainty of
the observation performed at the receiver. A certain amount
of overlap between the balls is allowed to construct dense
packings corresponding to codebooks of high capacity, as long
as the overlap does not include typical noise concentration
regions, and this makes it possible to achieve reliable
communication with vanishing probability of error. The more
stringent requirement of communication with probability of error
equal to zero leads to the notion of zero-error capacity [18], which
depends only on the region of uncertainty of the observation, and
not on its probabilistic distribution, and it can be expressed as the
supremum of a deterministic information functional [14].
Similarly, in Kolmogorov’s deterministic setting communication
between a transmitter and a receiver occurs without
error: balls of fixed radius $\epsilon$, representing the uncertainty
introduced by the noise about each transmitted signal, are not
allowed to overlap, and his notion of $2\epsilon$-capacity corresponds
to the Shannon zero-error capacity of the $\epsilon$-bounded noise
channel.

In order to represent vanishing error in a deterministic
setting, we allow a certain amount of overlap between the
$\epsilon$-balls. In our setting, a codebook is composed of a subset
of waveforms in the space, each corresponding to a given
message. A transmitter can select any one of these signals,
which is observed at the receiver with perturbation at most $\epsilon$.
If signals in the codebook are at distance less than $2\epsilon$ from
each other, a decoding error may occur due to the overlap
region between the corresponding $\epsilon$-balls. The total volume
of the error region, normalized by the total volume of the
$\epsilon$-balls in the codebook, represents a measure of the fraction
of space where the received signal may fall and result in a
communication error. The $(\epsilon, \delta)$-capacity is then defined as
the logarithm base two of the largest number of signals that
can be placed in a codebook having a normalized error region
of size at most $\delta$. We provide upper and lower bounds on
this quantity, when communication occurs using bandlimited,
square-integrable signals, and introduce a natural notion of
deterministic error exponent associated with it, which depends only
on the communication rate, on $\epsilon$, on the signals’ bandwidth,
and on the energy constraint. Our bounds become tight for
high values of the signal to noise ratio, and their functional
form indicates that capacity grows linearly with the number of
degrees of freedom, but only logarithmically with the signal
to noise ratio. This was Shannon’s original insight, revisited
here in a deterministic setting.

For $\delta = 0$ our notion of capacity reduces to the Kolmogorov
$2\epsilon$-capacity, and we provide new bounds on this quantity. By
comparing the lower bound for $\delta > 0$ and the upper bound for
$\delta = 0$, we also show that a strict inequality holds between the
corresponding values of capacity if the signal to noise ratio
is sufficiently large. The analogous result in a probabilistic
setting is that the Shannon capacity of the uniform noise
channel is strictly greater than the corresponding zero-error
capacity.

B. Entropy and covering 

Shannon’s entropy is closely related to the geometric problem
of covering a high-dimensional space with balls of given
radius. Roughly speaking, each source signal, modeled as a
stochastic process, corresponds to a random point in the space,
and by quantizing all coordinates of the space at a given
resolution, Shannon’s entropy corresponds to the number of
bits needed on average to represent the quantized signal. Thus,
the entropy depends on both the probability distribution of
the process, and the quantization step along the coordinates
of the space. A quantizer, however, does not need to act
uniformly on each coordinate, and can be more generally
viewed as a discrete set of balls covering the space. The
source signal is represented by the closest center of a ball
covering it, and the distance to the center of the ball represents
the distortion measure associated with this representation. In
this setting, Shannon’s rate distortion function provides the
minimum number of bits that must be specified per unit time
to represent the source process with a given average distortion.

In Kolmogorov’s deterministic setting, the $\epsilon$-entropy is the
logarithm of the minimum number of balls of radius $\epsilon$ needed
to cover the whole space and, when taken per unit time,
it corresponds to the Shannon rate-distortion function, as
it also represents the minimum number of bits that must
be specified per unit time to represent any source signal
with distortion at most $\epsilon$. We provide a tight expression for
this quantity, when sources are bandlimited, square-integrable
signals. The functional form of our result shows that the
$\epsilon$-entropy grows linearly with the number of degrees of freedom
and logarithmically with the ratio of the norm of the signal to
the norm of the distortion. Once again, this was Shannon’s key
insight that remains invariant when subject to a deterministic
formulation.
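
The covering picture can be illustrated with a small numerical sketch. The Python fragment below is a toy example under stated assumptions, not the construction used in this paper: the bandlimited signal space is replaced by a random sample of points in a two-dimensional disc of radius $\sqrt{E}$, and the names `greedy_epsilon_net` and `sample_disc`, as well as all parameter values, are hypothetical. It relies on the classical fact that a maximal set of points whose pairwise distances exceed $\epsilon$ is also an $\epsilon$-covering of the sample, so the base two logarithm of its size gives an upper estimate of the number of bits needed to describe any sampled point within accuracy $\epsilon$.

```python
import math
import random

def greedy_epsilon_net(points, eps):
    """Greedily pick centers so that every point lies within eps of a center.

    The chosen centers are pairwise more than eps apart (an eps-distinguishable
    set), and by construction they form an eps-covering of `points`.
    """
    centers = []
    for p in points:
        if all(math.dist(p, c) > eps for c in centers):
            centers.append(p)
    return centers

def sample_disc(n, radius, rng):
    """Draw n points uniformly from a 2-D disc (a toy stand-in for the signal ball)."""
    pts = []
    while len(pts) < n:
        x, y = rng.uniform(-radius, radius), rng.uniform(-radius, radius)
        if x * x + y * y <= radius * radius:
            pts.append((x, y))
    return pts

rng = random.Random(0)
E, eps = 1.0, 0.25                      # hypothetical energy and accuracy
points = sample_disc(2000, math.sqrt(E), rng)
net = greedy_epsilon_net(points, eps)
bits = math.log2(len(net))              # upper estimate of the eps-entropy of the sample
print(len(net), round(bits, 2))
```

The same set of centers is simultaneously a packing and a covering, which is the geometric reason why capacity and entropy are so closely related in the deterministic setting.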

The leitmotiv of the paper is the comparison between
deterministic and stochastic approaches to information theory,
and the presentation is organized as follows. In Section II we
informally describe our results. In Section III we present our
model rigorously, provide some definitions, recall results in the
literature that are useful for our derivations, and present our
technical approach. Section IV briefly discusses applications.
Section V provides precise mathematical statements of our
results, along with their proofs. A discussion of previous
results and the computation of the error exponent in the
deterministic setting appear in the Appendixes.

II. Description of the results 

We begin with an informal description of our results, which
is placed on rigorous grounds in subsequent sections.

A. Capacity 

We consider one-dimensional, real, scalar waveforms of a
single scalar variable, supported over an angular frequency
interval $[-\Omega, \Omega]$. We assume that waveforms are
square-integrable, and satisfy the energy constraint

$$\int_{-\infty}^{\infty} f^2(t)\,dt \leq E. \tag{1}$$

These bandlimited waveforms have unbounded time support,
but are observed over a finite interval $[-T/2, T/2]$. In this
way, and in a sense to be made precise below, any signal
can be expanded in terms of a suitable set of basis functions,
orthonormal over the real line, and for $T$ large enough it can
be seen as a point in a space of essentially

$$N_0 = \Omega T/\pi \tag{2}$$

dimensions, corresponding to the number of degrees of freedom
of the waveform, and of radius $\sqrt{E}$.

To introduce the notion of capacity, we consider an
uncertainty sphere of radius $\epsilon$ centered at each signal point,
representing the energy of the noise that is added to the
observed waveform. In this model, due to Kolmogorov, the
signal to noise ratio is

$$\mathrm{SNR}_K = E/\epsilon^2. \tag{3}$$

A codebook is composed of a subset of waveforms in the
space, each corresponding to a given message. A transmitter
can select any one of these signals, which is observed at the
receiver with perturbation at most $\epsilon$. By choosing signals in
the codebook to be at distance at least $2\epsilon$ from each other, the
receiver can decode the message without error. The $2\epsilon$-capacity
is the logarithm base two of the maximum number $M_{2\epsilon}(E)$
of distinguishable signals in the space. This geometrically
corresponds to the maximum number of disjoint balls of radius
$\epsilon$ with their centers situated inside the signals’ space, and it
is given by

$$C_{2\epsilon} = \log M_{2\epsilon}(E) \ \text{bits}. \tag{4}$$

We also define the capacity per unit time

$$\bar{C}_{2\epsilon} = \lim_{T \to \infty} \frac{\log M_{2\epsilon}(E)}{T} \ \text{bits per second}. \tag{5}$$


A similar Gaussian stochastic model, due to Shannon,
considers bandlimited signals in a space of essentially $N_0$
dimensions, subject to an energy constraint over the interval
$[-T/2, T/2]$ that scales linearly with the number of dimensions

$$\int_{-T/2}^{T/2} f^2(t)\,dt \leq P N_0, \tag{6}$$

and adds a zero mean Gaussian noise variable of standard
deviation $\sigma$ independently to each coordinate of the space. In
this model, the signal to noise ratio on each coordinate is

$$\mathrm{SNR}_S = P/\sigma^2. \tag{7}$$


Shannon’s capacity is the logarithm base two of the largest
number of messages $M_\sigma^\delta(P)$ that can be communicated with
probability of error $\delta > 0$. When taken per unit time, this is

$$C = \lim_{T \to \infty} \frac{\log M_\sigma^\delta(P)}{T} \ \text{bits per second}, \tag{8}$$

and it does not depend on $\delta$. The definition in (8) should
be compared with (5). The geometric insight on which the
two models are built is the same. However, while
in Kolmogorov’s deterministic model packing is performed
with “hard” spheres of radius $\epsilon$ and communication in the
presence of arbitrarily distributed noise over a bounded support
is performed without error, in Shannon’s stochastic model
packing is performed with “soft” spheres of effective radius
$\sqrt{N_0}\,\sigma$ and communication in the presence of Gaussian noise
of unbounded support is performed with arbitrarily low
probability of error $\delta$.

Shannon’s energy constraint (6) scales with the number
of dimensions, rather than being a constant. The reason for
this should be clear: since the noise is assumed to act
independently on each signal’s coefficient, the statistical spread
of the output, given the input signal, corresponds to an
uncertainty ball of radius $\sqrt{N_0}\,\sigma$. It follows that the norm
of the signal should also be proportional to $\sqrt{N_0}$, to avoid
a vanishing signal to noise ratio as $N_0 \to \infty$. In contrast, in
the case of Kolmogorov the capacity is computed assuming
an uncertainty ball of fixed radius $\epsilon$ and the energy constraint
is constant. In both cases, spectral concentration ensures that
the size of the signals’ space is essentially of $N_0$ dimensions.
Probabilistic concentration ensures that the noise in Shannon’s
model concentrates around its standard deviation, so that the
functional form of the results is similar in the two cases.
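
The probabilistic concentration invoked above can be checked numerically. The following Python sketch is only an illustration under assumed parameter values: it draws $N_0$ independent zero mean Gaussian coordinates of standard deviation $\sigma$ and compares the norm of the resulting noise vector with the effective radius $\sqrt{N_0}\,\sigma$. The helper name `noise_norm` and the chosen values of $\sigma$ and $N_0$ are hypothetical.

```python
import math
import random

def noise_norm(n0, sigma, rng):
    """Euclidean norm of n0 independent zero-mean Gaussian coordinates."""
    return math.sqrt(sum(rng.gauss(0.0, sigma) ** 2 for _ in range(n0)))

rng = random.Random(1)
sigma = 0.5                                   # hypothetical noise standard deviation
ratios = {}
for n0 in (10, 100, 10000):
    # Ratio of the observed norm to the effective radius sqrt(N0) * sigma.
    ratios[n0] = noise_norm(n0, sigma, rng) / (math.sqrt(n0) * sigma)
    print(n0, round(ratios[n0], 3))
```

As the number of dimensions grows, the ratio approaches one, which is why the “soft” Gaussian sphere behaves like a sphere of fixed effective radius in high dimensions.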

Shannon’s celebrated formula for the capacity of the Gaussian
model is [1]

$$C = \frac{\Omega}{\pi} \log\left(\sqrt{1 + \mathrm{SNR}_S}\right) \ \text{bits per second}. \tag{9}$$

Our results for Kolmogorov’s deterministic model are

$$\bar{C}_{2\epsilon} \leq \frac{\Omega}{\pi} \log\left(1 + \sqrt{\mathrm{SNR}_K/2}\right) \ \text{bits per second}, \tag{10}$$

$$\bar{C}_{2\epsilon} \geq \frac{\Omega}{\pi} \left(\log \sqrt{\mathrm{SNR}_K} - 1\right) \ \text{bits per second}. \tag{11}$$

The upper bound (10) is an improved version of our
previous one in [7]. For high values of the signal to noise
ratio, it becomes approximately $\frac{\Omega}{\pi}\left(\log \sqrt{\mathrm{SNR}_K} - 1/2\right)$,
i.e. tight up to a term $\Omega/(2\pi)$. Both upper and lower bounds
are improvements over the ones given by Jagerman [19], [20];
see Appendix A for a discussion.
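
To make the size of the gap concrete, the bounds can be evaluated numerically. The Python sketch below assumes base two logarithms throughout, as in the definitions above, and uses a hypothetical bandwidth. The function names are illustrative and simply transcribe (9), (10) and (11).

```python
import math

def shannon_capacity(omega, snr_s):
    """Eq. (9): (Omega/pi) * log2(sqrt(1 + SNR_S)), bits per second."""
    return omega / math.pi * math.log2(math.sqrt(1.0 + snr_s))

def kolmogorov_upper(omega, snr_k):
    """Eq. (10): (Omega/pi) * log2(1 + sqrt(SNR_K / 2))."""
    return omega / math.pi * math.log2(1.0 + math.sqrt(snr_k / 2.0))

def kolmogorov_lower(omega, snr_k):
    """Eq. (11): (Omega/pi) * (log2(sqrt(SNR_K)) - 1)."""
    return omega / math.pi * (math.log2(math.sqrt(snr_k)) - 1.0)

omega = 2 * math.pi * 1e3      # hypothetical bandwidth: 1 kHz
for snr_db in (10, 30, 60):
    snr = 10 ** (snr_db / 10)
    lo, up = kolmogorov_lower(omega, snr), kolmogorov_upper(omega, snr)
    gap = (up - lo) / (omega / (2 * math.pi))   # gap in units of Omega/(2*pi)
    print(snr_db, round(lo), round(up), round(shannon_capacity(omega, snr)), round(gap, 3))
```

In units of $\Omega/(2\pi)$ the gap between the two bounds approaches one as the signal to noise ratio grows, in agreement with the statement that the bounds are tight up to a term $\Omega/(2\pi)$.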

To provide a more precise comparison between the deterministic
and the stochastic model, we extend the deterministic
model allowing signals in the codebook to be at distance less
than $2\epsilon$ from each other. We say that signals in a codebook are
$(\epsilon, \delta)$-distinguishable if the portion of space where the received
signal may fall and result in a decoding error is of measure
at most $\delta$. The $(\epsilon, \delta)$-capacity is the logarithm base two of the
maximum number $M_\epsilon^\delta(E)$ of $(\epsilon, \delta)$-distinguishable signals in
the space and it is given by

$$C_\epsilon^\delta = \log M_\epsilon^\delta(E) \ \text{bits}. \tag{12}$$

We also define the $(\epsilon, \delta)$-capacity per unit time

$$\bar{C}_\epsilon^\delta = \lim_{T \to \infty} \frac{\log M_\epsilon^\delta(E)}{T} \ \text{bits per second}. \tag{13}$$

In this case, we show, for any $\epsilon, \delta > 0$,

$$\bar{C}_\epsilon^\delta \leq \frac{\Omega}{\pi} \log\left(1 + \sqrt{\mathrm{SNR}_K}\right) \ \text{bits per second}, \tag{14}$$

$$\bar{C}_\epsilon^\delta \geq \frac{\Omega}{\pi} \log \sqrt{\mathrm{SNR}_K} \ \text{bits per second}. \tag{15}$$

As in Shannon’s case, these results do not depend on the
size of the error region $\delta$. They become tight for high values
of the signal to noise ratio.

The lower bound follows from a random coding argument
by reducing the problem to the existence of a coding scheme
for a stochastic uniform noise channel with arbitrarily small
probability of error. The existence of such a scheme in the
stochastic setting implies the existence of a corresponding
scheme in the deterministic setting as well. Comparing (10)
and (15), it follows that in the high $\mathrm{SNR}_K$ regime, where

$$\frac{\sqrt{E}}{\epsilon} > \frac{\sqrt{2}}{\sqrt{2} - 1}, \tag{16}$$

having a positive error region guarantees a strictly larger
capacity. Given our proof reduction, this corresponds to having
a Shannon capacity for the uniform noise channel strictly
greater than the corresponding zero-error capacity.
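
The threshold at which the lower bound for $\delta > 0$ overtakes the upper bound for $\delta = 0$ can be verified directly. The Python sketch below assumes the bounds have the forms $\frac{\Omega}{\pi}\log\sqrt{\mathrm{SNR}_K}$ and $\frac{\Omega}{\pi}\log(1 + \sqrt{\mathrm{SNR}_K/2})$ and drops the common factor $\Omega/\pi$. Equating the two gives $\sqrt{\mathrm{SNR}_K} = \sqrt{2}/(\sqrt{2}-1) \approx 3.41$. The function names are hypothetical.

```python
import math

def lower_bound_delta(snr_k):
    """Lower bound for delta > 0, divided by Omega/pi: log2(sqrt(SNR_K))."""
    return math.log2(math.sqrt(snr_k))

def upper_bound_zero(snr_k):
    """Upper bound for delta = 0, divided by Omega/pi: log2(1 + sqrt(SNR_K / 2))."""
    return math.log2(1.0 + math.sqrt(snr_k / 2.0))

# Closed-form threshold on sqrt(SNR_K) = sqrt(E)/eps obtained by equating the two bounds.
threshold = math.sqrt(2.0) / (math.sqrt(2.0) - 1.0)

for root_snr in (0.9 * threshold, 1.1 * threshold):
    snr = root_snr ** 2
    print(round(root_snr, 3), lower_bound_delta(snr) > upper_bound_zero(snr))
```

Below the threshold the comparison is inconclusive, since the two bounds do not separate the capacities; above it the strict inequality between the two capacities follows.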

The analogy between the size of the error region in the
deterministic setting and the probability of error in the stochastic
setting also leads to a notion of deterministic error exponent.
Letting the number of messages in the codebook be $M = 2^{TR}$,
where the transmission rate $R$ is smaller than the lower bound
(15), in the Appendix we bound the size of the error region to
be at most

$$\delta \leq 2^{-T\left(\frac{\Omega}{\pi} \log \sqrt{\mathrm{SNR}_K} - R\right)}, \tag{17}$$

and the error exponent in the deterministic model is

$$E_r(R) = \frac{\Omega}{\pi} \log \sqrt{\mathrm{SNR}_K} - R > 0, \tag{18}$$

which depends only on $\Omega$, $E$, $\epsilon$, and on the transmission rate $R$.

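
A brief numerical sketch shows how quickly the error region shrinks with the observation time. It assumes that the bound on the error region has the exponential form $\delta \leq 2^{-T E_r(R)}$, with $E_r(R)$ as in (18) and base two logarithms. The function names and the parameter values are hypothetical.

```python
import math

def error_exponent(omega, snr_k, rate):
    """Eq. (18): E_r(R) = (Omega/pi) * log2(sqrt(SNR_K)) - R."""
    return omega / math.pi * math.log2(math.sqrt(snr_k)) - rate

def error_region_bound(omega, snr_k, rate, duration):
    """Assumed form of the bound: delta <= 2 ** (-T * E_r(R)), valid when E_r(R) > 0."""
    exponent = error_exponent(omega, snr_k, rate)
    if exponent <= 0:
        raise ValueError("rate must be below (Omega/pi) * log2(sqrt(SNR_K))")
    return 2.0 ** (-duration * exponent)

omega = math.pi            # hypothetical: Omega/pi = 1, one degree of freedom per second
snr_k = 16.0               # sqrt(SNR_K) = 4, so the rate must stay below 2 bits per second
for T in (1.0, 10.0, 100.0):
    print(T, error_region_bound(omega, snr_k, rate=1.5, duration=T))
```

For a fixed rate below the lower bound, the admissible error region decays exponentially in $T$, mirroring the role of the error exponent in the stochastic setting.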
B. Entropy 

We consider the same signal space as above, corresponding
to points of essentially $N_0 = \Omega T/\pi$ dimensions and contained
in a ball of radius $\sqrt{E}$. A source codebook is composed of
a subset of points in this space, and each codebook point is a
possible representation for the signals that are within radius $\epsilon$
of itself. If the union of the $\epsilon$-balls centered at all codebook
points covers the whole space, then any signal in the space
can be encoded by its closest representation. The radius $\epsilon$ of
the covering balls provides a bound on the largest estimation
error between any source $f(t)$ and its codebook representation
$\hat{f}(t)$. When signals are observed over a finite time interval
$[-T/2, T/2]$, this corresponds to

$$d[f(t), \hat{f}(t)] = \left(\int_{-T/2}^{T/2} [f(t) - \hat{f}(t)]^2\,dt\right)^{1/2} \leq \epsilon. \tag{19}$$

Following the usual convention in the literature, we call this
distortion measure noise, so that the signal to distortion ratio
in this source coding model is again $\mathrm{SNR}_K = E/\epsilon^2$.

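The Kolmogorov $\epsilon$-entropy is the logarithm base two of the
minimum number $L_\epsilon(E)$ of $\epsilon$-balls covering the whole space
and it is given by

$$H_\epsilon = \log L_\epsilon(E) \ \text{bits}. \tag{20}$$

We also define the $\epsilon$-entropy per unit time

$$\bar{H}_\epsilon = \lim_{T \to \infty} \frac{\log L_\epsilon(E)}{T} \ \text{bits per second}. \tag{21}$$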

An analogous Gaussian stochastic source model, due to
Shannon, models the source signal as a white Gaussian
stochastic process of constant power spectral density $P$ of
support $[-\Omega, \Omega]$. This stochastic process has infinite energy,
and finite average power

$$\mathbb{E}(f^2(t)) = R_f(0) = \frac{1}{2\pi} \int_{-\infty}^{\infty} S_f(\omega)\,d\omega = \frac{P\Omega}{\pi}, \tag{22}$$

where $R_f$ and $S_f$ are the autocorrelation and the power spectral
density of $f(t)$, respectively. When observed over the interval
$[-T/2, T/2]$, the process can be viewed as a random point
having essentially $N_0$ independent Gaussian coordinates of
zero mean and variance $P$, and of energy

$$\int_{-T/2}^{T/2} \mathbb{E}(f^2(t))\,dt = \frac{P \Omega T}{\pi} = P N_0. \tag{23}$$

A source codebook is composed of a subset of points in the
space, and each codebook point is a possible representation
for the stochastic process. The distortion associated with the
representation of $f(t)$ using codebook point $\hat{f}(t)$ is defined
in terms of mean-squared error

$$d[f(t), \hat{f}(t)] = \int_{-T/2}^{T/2} \mathbb{E}[f(t) - \hat{f}(t)]^2\,dt. \tag{24}$$

Letting $L_\sigma(P)$ be the smallest number of codebook points that
can be used to represent the source process with distortion at
most $\sigma^2 N_0$, the rate-distortion function is defined as

$$R_\sigma = \lim_{T \to \infty} \frac{\log L_\sigma(P)}{T} \ \text{bits per second}. \tag{25}$$

In this setting, Shannon’s formula for the rate distortion
function of a Gaussian source is [1]

$$R_\sigma = \frac{\Omega}{\pi} \log\left(\sqrt{\mathrm{SNR}_S}\right) \ \text{bits per second}. \tag{26}$$

We show the corresponding result in Kolmogorov’s
deterministic setting

$$\bar{H}_\epsilon = \frac{\Omega}{\pi} \log\left(\sqrt{\mathrm{SNR}_K}\right) \ \text{bits per second}. \tag{27}$$

Previously, Jagerman [19], [20] had shown

$$0 \leq \bar{H}_\epsilon \leq \frac{\Omega}{\pi} \log\left(1 + 2\sqrt{\mathrm{SNR}_K}\right), \tag{28}$$

see the Appendix for a discussion. Our result in (27) can
be derived by combining a theorem of Dumer, Pinsker and
Prelov [21, Theorem 2] on the thinnest covering of ellipsoids
in Euclidean spaces of arbitrary dimension, our Lemma 1
on the phase transition of the dimensionality of bandlimited
square-integrable functions, and an approximation argument
given in our Theorem 6. Instead, we provide a self-contained
proof.
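
The improvement over the earlier bound can be quantified with a short computation. The Python sketch below takes the tight expression to be $\frac{\Omega}{\pi}\log\sqrt{\mathrm{SNR}_K}$ and Jagerman's upper bound to be $\frac{\Omega}{\pi}\log(1 + 2\sqrt{\mathrm{SNR}_K})$, with base two logarithms. The normalization $\Omega/\pi = 1$ and the function names are hypothetical choices made for illustration.

```python
import math

def epsilon_entropy_rate(omega, snr_k):
    """Tight expression: (Omega/pi) * log2(sqrt(SNR_K)), bits per second."""
    return omega / math.pi * math.log2(math.sqrt(snr_k))

def jagerman_upper(omega, snr_k):
    """Earlier upper bound: (Omega/pi) * log2(1 + 2 * sqrt(SNR_K))."""
    return omega / math.pi * math.log2(1.0 + 2.0 * math.sqrt(snr_k))

omega = math.pi                      # hypothetical normalization: Omega/pi = 1
for snr_db in (0, 20, 40, 80):
    snr = 10 ** (snr_db / 10)
    tight, loose = epsilon_entropy_rate(omega, snr), jagerman_upper(omega, snr)
    print(snr_db, round(tight, 3), round(loose, 3), round(loose - tight, 3))
```

At high signal to noise ratio the earlier bound exceeds the tight value by about one bit per degree of freedom, that is by $\Omega/\pi$ bits per second.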

C. Summary 

Table I provides a comparison between results in the
deterministic and in the stochastic setting. In the computation of
capacity, a transmitted signal, subject to a given energy constraint,
is corrupted by additive noise. Due to spectral concentration,
the signal has an effective number of dimensions $N_0$. In a
deterministic setting, the noise, represented by the deterministic
coordinates $\{n_i\}$, can take any value inside a ball of radius
$\epsilon$. In a stochastic setting, due to probabilistic concentration,
the noise, represented by the stochastic coordinates $\{\mathsf{n}_i\}$, can
take values essentially uniformly at random inside a ball of
effective radius $\sqrt{N_0}\,\sigma$. In both cases, the maximum cardinality
of the codebook used for communication depends on the error
measure $\delta > 0$, but the capacity in bits per unit time does not,
and it depends only on the signal to noise ratio. The special
case $\delta = 0$ is treated separately, and it does not appear in the
table. This corresponds to the Kolmogorov $2\epsilon$-capacity, and is
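the analog of the Shannon zero-error capacity of an $\epsilon$-bounded
noise channel.

TABLE I

Comparison of stochastic and deterministic models

| | Stochastic | Deterministic |
|---|---|---|
| Transmitted Signal | $\int_{-T/2}^{T/2} f^2(t)\,dt \leq P N_0$ | $\int_{-\infty}^{\infty} f^2(t)\,dt \leq E$ |
| Additive Noise | $\sum_{i=1}^{N_0} \mathbb{E}(\mathsf{n}_i^2) = \sigma^2 N_0$ | $\sum_{i=1}^{N_0} n_i^2 \leq \epsilon^2$ |
| Effective Dimensionality | $N_0 = \Omega T/\pi$ | $N_0 = \Omega T/\pi$ |
| Signal to Noise Ratio | $\mathrm{SNR}_S = P/\sigma^2$ | $\mathrm{SNR}_K = E/\epsilon^2$ |
| Max Cardinality of Codebook | $M_\sigma^\delta(P)$ | $M_\epsilon^\delta(E)$ |
| Capacity | $C = \frac{\Omega}{\pi} \log\sqrt{1 + \mathrm{SNR}_S}$ | $\frac{\Omega}{\pi} \log\sqrt{\mathrm{SNR}_K} \leq \bar{C}_\epsilon^\delta \leq \frac{\Omega}{\pi} \log(1 + \sqrt{\mathrm{SNR}_K})$ |
| Source Signal | $\int_{-T/2}^{T/2} \mathbb{E}(f^2(t))\,dt = P N_0$ | $\int_{-\infty}^{\infty} f^2(t)\,dt \leq E$ |
| Distortion | $d[f(t), \hat{f}(t)] \leq \sigma^2 N_0$ | $d[f(t), \hat{f}(t)] \leq \epsilon$ |
| Min Cardinality of Codebook | $L_\sigma(P)$ | $L_\epsilon(E)$ |
| Rate Distortion Function | $R_\sigma = \frac{\Omega}{\pi} \log\sqrt{\mathrm{SNR}_S}$ | $\bar{H}_\epsilon = \frac{\Omega}{\pi} \log\sqrt{\mathrm{SNR}_K}$ |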

In the computation of the rate distortion function, a source
signal is modeled as either an arbitrary or a stochastic process
subject to a given energy constraint. The distortion measure
corresponds to the estimation error incurred when this signal is
represented by an element of the source codebook. The minimum
cardinality of the codebook used for representation depends on the
distortion constraint, and so does the rate distortion function.

In both the deterministic and stochastic settings we have a
tight asymptotic characterization of the rate distortion function,
while we have bounds for the capacity in the deterministic
setting that are tight only in the high $\mathrm{SNR}_K$ regime. This
is because distances in the probabilistic model are measured
in terms of standard deviation, while they are measured in
terms of the $L^2[-T/2, T/2]$ norm in the deterministic model. The
computation of capacity requires summing the signal and the
noise, and in the probabilistic model the norm of the sum
of two signals can be expressed as the square root of the
sum of their variances, leading to a tight expression. In the
deterministic model, the norm of the sum of two signals can
only be bounded, and this leads to a gap between upper and
lower bounds that vanishes for high values of $\mathrm{SNR}_K$. In the
case of rate distortion, we do not need to compute the sum of
two signals, and tight bounds are obtained in both settings.

III. The signals’ space 

We now describe the signals’ space rigorously, mention 
some classic results required for our derivations, introduce 
rigorous notions of capacity and entropy, and present the 
technical approach that we use in the proofs. 


A. Energy-constrained, bandlimited functions 

We consider the set of one-dimensional, real, bandlimited
functions

$$\mathcal{B}_\Omega = \{f(t) : \mathcal{F}f(\omega) = 0, \ \text{for } |\omega| > \Omega\}, \tag{29}$$

where

$$\mathcal{F}f(\omega) = \int_{-\infty}^{\infty} f(t) \exp(-j\omega t)\,dt, \tag{30}$$

and $j$ denotes the imaginary unit.

These functions are assumed to be square-integrable, and
to satisfy the energy constraint (1). We equip them with the
$L^2[-T/2, T/2]$ norm

$$\|f\| = \left(\int_{-T/2}^{T/2} f^2(t)\,dt\right)^{1/2}. \tag{31}$$

It follows that $(\mathcal{B}_\Omega, \|\cdot\|)$ is a metric space, whose elements are
real, bandlimited functions, of infinite duration and observed
over a finite interval $[-T/2, T/2]$. The elements of this space
can be optimally approximated, in the sense of Kolmogorov,
using a finite series expansion of a suitable basis set.

B. Prolate spheroidal basis set

Given any $T, \Omega > 0$, there exists a countably infinite set
of real functions $\{\psi_n(t)\}_{n=1}^{\infty}$ called prolate spheroidal wave
functions (PSWF), and a set of real positive numbers
$1 > \lambda_1 > \lambda_2 > \cdots$ with the following properties:

Property 1. The elements of $\{\lambda_n\}$ and $\{\psi_n\}$ are solutions
of the Fredholm integral equation of the second kind

$$\lambda_n \psi_n(t) = \int_{-T/2}^{T/2} \psi_n(s) \frac{\sin \Omega(t-s)}{\pi(t-s)}\,ds. \tag{32}$$

Property 2. The elements of $\{\psi_n(t)\}$ have Fourier transform
that is zero for $|\omega| > \Omega$.

Property 3. The set $\{\psi_n(t)\}$ is complete in $\mathcal{B}_\Omega$.

Property 4. The elements of $\{\psi_n(t)\}$ are orthonormal in
$(-\infty, \infty)$

$$\int_{-\infty}^{\infty} \psi_n(t) \psi_m(t)\,dt = \begin{cases} 1 & n = m, \\ 0 & \text{otherwise}. \end{cases} \tag{33}$$

Property 5. The elements of $\{\psi_n(t)\}$ are orthogonal in
$(-T/2, T/2)$

$$\int_{-T/2}^{T/2} \psi_n(t) \psi_m(t)\,dt = \begin{cases} \lambda_n & n = m, \\ 0 & \text{otherwise}. \end{cases} \tag{34}$$
Property 6. The eigenvalues in $\{\lambda_n\}$ undergo a phase
transition at the scale of $N_0 = \Omega T/\pi$: for any $\alpha > 0$

$$\lim_{N_0 \to \infty} \lambda_{\lfloor (1-\alpha) N_0 \rfloor} = 1, \tag{35}$$

$$\lim_{N_0 \to \infty} \lambda_{\lfloor (1+\alpha) N_0 \rfloor} = 0. \tag{36}$$

Property 7. The width of the phase transition can be
precisely characterized: for any $k > 0$

$$\lim_{N_0 \to \infty} \lambda_{\lfloor N_0 + k \log(N_0 \pi/2) \rfloor} = \frac{1}{1 + e^{\pi^2 k}}. \tag{37}$$

For an extended treatment of PSWF see [22]. The phase
transition behavior of the eigenvalues is a key property
related to the number of terms required for a satisfactory
approximation of any square integrable bandlimited function
using a finite basis set. Much of the theory was developed
jointly by Landau, Pollak, and Slepian, see [12] for a review.
The precise asymptotic behavior in (37) was finally proven
by Landau and Widom [23], after a conjecture of Slepian
supported by a non-rigorous computation [24].

C. Approximation of $\mathcal{B}_\Omega$

Let $\mathcal{X} = L^2[-T/2, T/2]$. The Kolmogorov $N$-width [25] of
$\mathcal{B}_\Omega$ in $\mathcal{X}$ is

$$d_N(\mathcal{B}_\Omega, \mathcal{X}) = \inf_{\mathcal{X}_N \subseteq \mathcal{X}} \ \sup_{f \in \mathcal{B}_\Omega} \ \inf_{g \in \mathcal{X}_N} \|f - g\|, \tag{38}$$

where $\mathcal{X}_N$ is an $N$-dimensional subspace of $\mathcal{X}$. For any $\mu > 0$,
we use this notion to define the number of degrees of freedom
at level $\mu$ of the space $\mathcal{B}_\Omega$ as

$$N_\mu(\mathcal{B}_\Omega) = \min\{N : d_N(\mathcal{B}_\Omega, \mathcal{X}) \leq \mu\}. \tag{39}$$

In words, the Kolmogorov $N$-width represents the extent
to which $\mathcal{B}_\Omega$ may be uniformly approximated by an
$N$-dimensional subspace of $\mathcal{X}$, and the number of degrees of
freedom is the dimension of the minimal subspace representing
the elements of $\mathcal{B}_\Omega$ within the desired accuracy $\mu$. It follows
that the number of degrees of freedom represents the effective
dimensionality of the space, and corresponds to the number
of coordinates that are essentially needed to identify any one
element in the space.

A basic result in approximation theory (see e.g. [25, Ch. 2,
Prop. 2.8]) states that

$$d_N(\mathcal{B}_\Omega, \mathcal{X}) = \sqrt{E \lambda_{N+1}}, \tag{40}$$

and the corresponding approximating subspace is the one
spanned by the PSWF basis set $\{\psi_n\}_{n=1}^{N}$. It follows that any
bandlimited function $f \in \mathcal{B}_\Omega$ can be optimally approximated
by retaining a finite number $N$ of terms in the series expansion

$$f(t) = \sum_{n=1}^{\infty} b_n \psi_n(t), \tag{41}$$

and that the number of degrees of freedom in (39) is given
by the minimum index $N$ such that $\sqrt{\lambda_{N+1}} \leq \mu/\sqrt{E}$. The
phase transition of the eigenvalues ensures that this number is
only slightly larger than $N_0$. More precisely, for any $\mu > 0$
we may choose an integer

$$N = N_0 + \frac{1}{\pi^2} \log\left(\frac{E}{\mu^2} - 1\right) \log\left(\frac{N_0 \pi}{2}\right) + o(\log N_0), \tag{42}$$

and approximate

$$\mathcal{B}_\Omega = \left\{\mathbf{b} = (b_1, b_2, \ldots) : \sum_{n=1}^{\infty} b_n^2 \leq E\right\}, \tag{43}$$

within accuracy $\mu$ as $N_0 \to \infty$ using

$$\mathcal{B}'_\Omega = \left\{\mathbf{b} = (b_1, b_2, \ldots, b_N) : \sum_{n=1}^{N} b_n^2 \leq E\right\}, \tag{44}$$

equipped with the norm

$$\|\mathbf{b}\|' = \sqrt{\sum_{n=1}^{N} \lambda_n b_n^2}. \tag{45}$$


The energy constraint in (44) follows from (1) using the
orthonormality Property 4 of the PSWF, the norm in (45)
follows from (31) using the orthogonality Property 5 of the
PSWF, and the desired level of approximation is guaranteed
by Property 7 of the PSWF.

By (42) it follows that the number of degrees of freedom
is an intrinsic property of the space, essentially dependent on
the time-bandwidth product $N_0 = \Omega T/\pi$, and only weakly,
i.e. logarithmically, on the accuracy $\mu$ of the approximation
and on the energy constraint $E$.
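
The weak dependence on the accuracy and on the energy can be seen in a rough numerical sketch. The Python code below rests on two assumptions that are not results of this paper: it uses the limiting value in Property 7, read as $1/(1 + e^{\pi^2 k})$ for the eigenvalue of index $N_0 + k \log(N_0 \pi/2)$, as an approximation at finite $N_0$, and it takes the number of degrees of freedom to be the smallest $N$ with $\sqrt{\lambda_{N+1}} \leq \mu/\sqrt{E}$. The function names `eigenvalue_model` and `degrees_of_freedom` are hypothetical.

```python
import math

def eigenvalue_model(n, n0):
    """Approximate lambda_n near the transition by the limit in Property 7.

    Writing n = N0 + k * log(N0 * pi / 2), the limit is 1 / (1 + exp(pi**2 * k)).
    Using it at finite N0 is an assumption made for illustration only.
    """
    k = (n - n0) / math.log(n0 * math.pi / 2.0)
    z = math.pi ** 2 * k
    if z > 700:          # avoid overflow in exp
        return 0.0
    return 1.0 / (1.0 + math.exp(z))

def degrees_of_freedom(n0, energy, mu):
    """Smallest N with sqrt(lambda_{N+1}) <= mu / sqrt(energy) under the model above."""
    target = (mu * mu) / energy
    n = 0
    while eigenvalue_model(n + 1, n0) > target:
        n += 1
    return n

energy = 1.0
for n0 in (100, 1000, 10000):
    for mu in (1e-1, 1e-3):
        n = degrees_of_freedom(n0, energy, mu)
        print(n0, mu, n, round(n / n0, 4))
```

Under this model, tightening the accuracy by two orders of magnitude adds only a handful of dimensions, and the ratio $N/N_0$ tends to one as the time-bandwidth product grows.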

These approximation-theoretic results show that any
energy-constrained, bandlimited waveform can be identified by
essentially $N_0$ real numbers. This does not pose a limit on the
amount of information carried by the signal. The real numbers
identifying the waveform can be specified up to arbitrary
precision, and this results in an infinite number of possible
waveforms that can be used for communication. To bound
the amount of information, we need to introduce a resolution
limit at which the waveform can be observed, which allows
an information-theoretic description of the space using bits
rather than real numbers. This description is given in terms of
entropy and capacity.


D. $\epsilon$-entropy and $\epsilon$-capacity

Let $\mathcal{A}$ be a subset of the metric space $\mathcal{X} = L^2[-T/2, T/2]$. A set of points in $\mathcal{A}$ is called an $\epsilon$-covering if for any point in $\mathcal{A}$ there exists a point in the covering at distance at most $\epsilon$ from it. The minimum cardinality of an $\epsilon$-covering is an invariant of the set $\mathcal{A}$, which depends only on $\epsilon$, and is denoted by $L_\epsilon(\mathcal{A})$. The $\epsilon$-entropy of $\mathcal{A}$ is defined as the base two logarithm

$$H_\epsilon(\mathcal{A}) = \log L_\epsilon(\mathcal{A}) \ \text{bits}, \tag{46}$$

see Fig. 1(a). We also define the $\epsilon$-entropy per unit time

$$\bar{H}_\epsilon(\mathcal{A}) = \lim_{T \to \infty} \frac{H_\epsilon(\mathcal{A})}{T} \ \text{bits per second}. \tag{47}$$

A set of points in $\mathcal{A}$ is called $\epsilon$-distinguishable if the distance between any two of them exceeds $\epsilon$. The maximum cardinality of an $\epsilon$-distinguishable set is an invariant of the set $\mathcal{A}$, which depends only on $\epsilon$, and is denoted by $M_\epsilon(\mathcal{A})$. The $\epsilon$-capacity of $\mathcal{A}$ is defined as the base two logarithm

$$C_\epsilon(\mathcal{A}) = \log M_\epsilon(\mathcal{A}) \ \text{bits}, \tag{48}$$

see Fig. 1(b). We also define the $\epsilon$-capacity per unit time

$$\bar{C}_\epsilon(\mathcal{A}) = \lim_{T \to \infty} \frac{C_\epsilon(\mathcal{A})}{T} \ \text{bits per second}. \tag{49}$$

Fig. 1. Part (a): Illustration of the $\epsilon$-entropy. Part (b): Illustration of the $\epsilon$-capacity.

The $\epsilon$-entropy and $\epsilon$-capacity are closely related to the probabilistic notions of entropy and capacity used in information theory. The $\epsilon$-entropy corresponds to the rate distortion function, and the $\epsilon$-capacity corresponds to the zero-error capacity. In order to have a deterministic quantity that corresponds to the Shannon capacity, we extend the $\epsilon$-capacity and allow a small fraction of intersection among the $\epsilon$-balls when constructing a packing set. This leads to a certain region of space where the received signal may fall and result in a communication error, and to the notion of $(\epsilon, \delta)$-capacity.

E. $(\epsilon, \delta)$-capacity

Let $\mathcal{A}$ be a subset of the metric space $\mathcal{X} = L^2[-T/2, T/2]$. We consider a set of points in $\mathcal{A}$, $\mathcal{M} = \{a^{(1)}, a^{(2)}, \ldots, a^{(M)}\}$. For a given $1 \le i \le M$, we let the noise ball

$$S^i = \{x \in \mathcal{X} : \|x - a^{(i)}\| \le \epsilon\}, \tag{50}$$

where $\epsilon$ is a positive real number, and we let the error region with respect to minimum distance decoding

$$D^i = \{x \in S^i : \exists\, j \ne i : \|x - a^{(j)}\| \le \|x - a^{(i)}\|\}. \tag{51}$$

We define the error measure for the $i$th signal

$$\Delta_i = \frac{\mathrm{vol}(D^i)}{\mathrm{vol}(S^i)}, \tag{52}$$

where $\mathrm{vol}(\cdot)$ indicates volume in $\mathcal{X}$, and the cumulative error measure

$$\Delta = \frac{1}{M} \sum_{i=1}^{M} \Delta_i. \tag{53}$$

Fig. 2(a) provides an illustration of the error region for a signal in the space. Clearly, we have $0 \le \Delta \le 1$. For any $\delta > 0$, we say that a set of points $\mathcal{M}$ in $\mathcal{A}$ is an $(\epsilon, \delta)$-distinguishable set if $\Delta \le \delta$. The maximum cardinality of an $(\epsilon, \delta)$-distinguishable set is an invariant of the space $\mathcal{A}$, which depends only on $\epsilon$ and $\delta$, and is denoted by $M^\delta_\epsilon(\mathcal{A})$. The $(\epsilon, \delta)$-capacity of $\mathcal{A}$ is defined as the base two logarithm

$$C^\delta_\epsilon(\mathcal{A}) = \log M^\delta_\epsilon(\mathcal{A}) \ \text{bits}, \tag{54}$$

see Fig. 2(b). We also define the $(\epsilon, \delta)$-capacity per unit time

$$\bar{C}^\delta_\epsilon(\mathcal{A}) = \lim_{T \to \infty} \frac{C^\delta_\epsilon(\mathcal{A})}{T} \ \text{bits per second}. \tag{55}$$

Fig. 2. Part (a): Illustration of the error region for a signal in the space. The letters indicate the volume of the corresponding regions of the ball, and $\Delta_i = (c + e + f + g + i)/(a + b + c + d + e + f + g + h + i)$. Part (b): Illustration of the $(\epsilon, \delta)$-capacity. An overlap among the $\epsilon$-balls is allowed, provided that the cumulative error measure $\Delta \le \delta$.
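The definitions of the noise ball, the error region, and the cumulative error measure can be explored numerically. The following is a minimal sketch, assuming a low-dimensional Euclidean space as a stand-in for $L^2[-T/2, T/2]$; the function name `cumulative_error_measure` and the Monte Carlo approach are illustrative choices, not part of the paper.

```python
import math
import random

def cumulative_error_measure(codebook, eps, samples=20000, seed=0):
    """Monte Carlo estimate of Delta = (1/M) * sum_i vol(D^i) / vol(S^i).

    Assumption: codewords are points of a small Euclidean space, used here
    only as a finite-dimensional stand-in for the signal space of the paper.
    """
    rng = random.Random(seed)
    dim = len(codebook[0])
    total = 0.0
    for i, center in enumerate(codebook):
        errors = 0
        for _ in range(samples):
            # Rejection-sample a point uniformly inside the eps-ball S^i.
            while True:
                offset = [rng.uniform(-eps, eps) for _ in range(dim)]
                if sum(o * o for o in offset) <= eps * eps:
                    break
            x = [c + o for c, o in zip(center, offset)]
            d_own = math.dist(x, center)
            # x is in the error region D^i if another codeword is at least as close.
            if any(math.dist(x, other) <= d_own
                   for j, other in enumerate(codebook) if j != i):
                errors += 1
        total += errors / samples
    return total / len(codebook)

# Codewords 2*eps or more apart give Delta = 0; closer codewords give Delta > 0.
far = cumulative_error_measure([(0.0, 0.0), (3.0, 0.0)], eps=1.0, samples=2000)
near = cumulative_error_measure([(0.0, 0.0), (1.0, 0.0)], eps=1.0)
```

With two codewords at distance at least $2\epsilon$ the estimate is zero, which matches the case $\delta = 0$; moving them closer trades a nonzero $\Delta$ for a denser packing.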


F. Technical approach 

Our objective is to compute entropy and capacity of square integrable, bandlimited functions. First, we perform this computation for the finite-dimensional space of functions $\mathcal{B}'_\Omega$ that approximates the infinite-dimensional space $\mathcal{B}_\Omega$ up to arbitrary accuracy $\mu > 0$ in the $L^2[-T/2, T/2]$ norm, as $N_0 \to \infty$. Our results in this setting are given by Theorem 1 for the $\epsilon$-capacity, Theorem 2 for the $(\epsilon, \delta)$-capacity, and Theorem 3 for the $\epsilon$-entropy. Then, in Theorems 4, 5 and 6 we extend the computation to the $\epsilon$-capacity, $(\epsilon, \delta)$-capacity, and $\epsilon$-entropy of the whole space $\mathcal{B}_\Omega$ of bandlimited functions.

When viewed per unit time, results for the two spaces are identical, indicating that using a highly accurate, lower-dimensional subspace approximation leaves only a negligible “information leak” in higher dimensions. We bound this leak in the case of $\epsilon$-entropy and $\epsilon$-capacity by performing a projection from the high-dimensional space $\mathcal{B}_\Omega$ onto the lower-dimensional one $\mathcal{B}'_\Omega$ and noticing that distances do not change significantly when these two spaces are sufficiently close to one another. On the other hand, for the $(\epsilon, \delta)$-capacity the error is defined in terms of volume, which may change significantly, no matter how close the two spaces are. In this case, we cannot bound the $(\epsilon, \delta)$-capacity of $\mathcal{B}_\Omega$ by performing a projection onto $\mathcal{B}'_\Omega$, and instead provide a bound on the capacity per unit time in terms of another finite-dimensional space that asymptotically approximates $\mathcal{B}_\Omega$ with perfect accuracy $\mu = 0$, as $N_0 \to \infty$.










IV. Applications 

Recent interest in deterministic models of information has 
been raised in the context of control theory and electromag¬ 
netic wave theory. 

Control theory often treats uncertainties and disturbances as bounded unknowns having no statistical structure. In this context, Nair [14] introduced a maximin information functional for non-stochastic variables and used it to derive tight conditions for uniformly estimating the state of a linear time-invariant system over an error-prone channel. The relevance of Nair's approach to estimation over unreliable channels is due to its connection with the Shannon zero-error capacity [14, Theorem 4.1], which has applications in networked control theory ini. In Appendix B we point out that Nair's maximum information rate functional, when viewed in our continuous setting of communication with bandlimited signals, is nothing else than $\bar{C}_{2\epsilon}(\mathcal{B}_\Omega)$. This suggests that our approach can be used in the same context as his.

In electromagnetic wave theory, the field measurement accuracy, and the corresponding image resolution in remote sensing applications, are often treated as fixed constants below which distinct electromagnetic fields, corresponding to different images, must be considered indistinguishable. In this framework, the number of degrees of freedom of radiating fields has been determined starting from their bandlimitation properties [26], (Ell. Using the same approach, in communication theory the number of parallel channels available in spatially distributed multiple antenna systems under a fixed noise level constraint has been determined and related to the cut-set boundary separating transmitters and receivers El. Our results can be used in the same setting to provide the extension from the approximation-theoretic notion of degrees of freedom to the information-theoretic ones of entropy and capacity, something already suggested in [27].

Several other applications of the deterministic approach 
pursued here seem worth exploring, including the analysis 
of multi-band signals of sparse support. More generally, one 
could study capacity and entropy under different constraints
besides bandlimitation, and attempt, for example, to obtain
formulas analogous to waterfilling solutions in a deterministic 
setting. 


V. Nothing but proofs 

We start with some preliminary lemmas that are needed 
for the proof of our main theorems. The first lemma is a 
consequence of the phase transition of the eigenvalues, while 
the second and third lemmas are properties of Euclidean 
spaces. 


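Lemma 1. Let

$$\zeta(N) = \left( \prod_{i=1}^{N} \lambda_i \right)^{1/(2N)}, \tag{57}$$

where $N = N_0 + O(\log N_0)$ as $N_0 \to \infty$. We have

$$\lim_{N_0 \to \infty} \zeta(N) = 1. \tag{56}$$

Proof: For any $\alpha > 0$, we have

$$\log \zeta(N) = \frac{1}{2N} \left( \sum_{i=1}^{\lfloor (1-\alpha) N_0 \rfloor} \log \lambda_i + \sum_{i=\lfloor (1-\alpha) N_0 \rfloor + 1}^{N} \log \lambda_i \right). \tag{58}$$

From Property 6 of the PSWF and the monotonicity of the eigenvalues it follows that the first sum in (58) tends to zero as $N_0 \to \infty$. We turn our attention to the second sum. By the monotonicity of the eigenvalues, we have

$$\sum_{i=\lfloor (1-\alpha) N_0 \rfloor + 1}^{N} \log \lambda_i \ge \left( N - (1-\alpha) N_0 \right) \log \lambda_N. \tag{59}$$

Since $N = N_0 + O(\log N_0)$ as $N_0 \to \infty$, there exists a constant $k$ such that for $N_0$ large enough $N \le N_0 + k \log N_0$ and the right-hand side is an integer. It follows that for $N_0$ large enough, we have

$$\sum_{i=\lfloor (1-\alpha) N_0 \rfloor + 1}^{N} \log \lambda_i \ge (\alpha N_0 + k \log N_0) \log \lambda_N \ge (\alpha N_0 + k \log N_0) \log \left( \lambda_{N_0 + k \log N_0} \right). \tag{60}$$

Substituting (60) into (58) and using Property 7 of the PSWF, it follows that for $N_0$ large enough

$$\log \zeta(N) \ge \frac{\alpha N_0 + k \log N_0}{2N} \log \left( \frac{1}{1 + e^{\pi^2 k}} \right), \tag{61}$$

and since $N = N_0 + O(\log N_0)$ as $N_0 \to \infty$, we have

$$\lim_{N_0 \to \infty} \log \zeta(N) \ge \frac{\alpha}{2} \log \left( \frac{1}{1 + e^{\pi^2 k}} \right). \tag{62}$$

The proof is completed by noting that $\alpha$ can be arbitrarily small, and that $\zeta(N) \le 1$ because every eigenvalue $\lambda_i$ is smaller than one. ■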


Lemma 2. Let $m$ be a positive integer and let $x, x^{(1)}, \ldots, x^{(m)}$ be arbitrary points in $n$-dimensional Euclidean space, $(\mathbb{E}^n, \|\cdot\|)$. We have

$$\sum_{j=1}^{m} \sum_{k=1}^{m} \|x^{(j)} - x^{(k)}\|^2 \le 2m \sum_{i=1}^{m} \|x - x^{(i)}\|^2. \tag{63}$$

The proof is given in ESi Lemma 6.1]. 
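Inequality (63) can be checked numerically. The sketch below is illustrative only: the helper name `lemma2_sides` and the random test points are assumptions, and the check relies on the standard identity that the right-hand side minus the left-hand side equals $2\|\sum_i (x - x^{(i)})\|^2$.

```python
import random

def lemma2_sides(x, points):
    """Return (lhs, rhs) of inequality (63) for a reference point x and a list of points."""
    def sqdist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    m = len(points)
    lhs = sum(sqdist(p, q) for p in points for q in points)
    rhs = 2 * m * sum(sqdist(x, p) for p in points)
    return lhs, rhs

rng = random.Random(1)
n, m = 5, 7
points = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(m)]

# Arbitrary reference point: the inequality holds with some slack.
x = [rng.gauss(0, 1) for _ in range(n)]
lhs, rhs = lemma2_sides(x, points)

# Reference point at the centroid: the two sides coincide up to rounding.
centroid = [sum(p[d] for p in points) / m for d in range(n)]
lhs_c, rhs_c = lemma2_sides(centroid, points)
```

The centroid case shows that the constant $2m$ cannot be improved in general.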

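Lemma 3. Let $L$ be the cardinality of the minimal $\epsilon$-covering of the $\sqrt{E}$-ball $S_{\sqrt{E}}$ in $\mathbb{E}^n$. If $n \ge 9$, we have

$$L \le \frac{4e \cdot n^{3/2}}{\ln n - 2} \left( \frac{\sqrt{E}}{\epsilon} \right)^{n} \left[ n \ln n + o(n \ln n) \right], \tag{64}$$

where $1 < \frac{\sqrt{E}}{\epsilon} < \frac{n}{\ln n}$.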

The proof is given in Theorem 2]. 


A. Main theorems for $\mathcal{B}'_\Omega$

Although the set $\mathcal{B}'_\Omega$ in (44) defines an $N$-dimensional hypersphere, the metric in (45) is not Euclidean. It is convenient to express the metric in Euclidean form by performing a scaling transformation of the coordinates of the space. For all $n$, we let $a_n = b_n \sqrt{\lambda_n}$, so that we have

$$\mathcal{B}'_\Omega = \left\{ a = (a_1, a_2, \ldots, a_N) : \sum_{n=1}^{N} \frac{a_n^2}{\lambda_n} \le E \right\}, \tag{65}$$

and

$$\|a\|' = \sqrt{\sum_{n=1}^{N} a_n^2}. \tag{66}$$

We now consider packing and covering with $\epsilon$-balls inside the ellipsoid defined in (65), using the Euclidean metric in (66).

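Theorem 1. For any $\epsilon > 0$, we have

$$\bar{C}^0_\epsilon(\mathcal{B}'_\Omega) \ge \frac{\Omega}{\pi} \left( \log \sqrt{\mathrm{SNR}_K} - 1 \right), \tag{67}$$

$$\bar{C}^0_\epsilon(\mathcal{B}'_\Omega) \le \frac{\Omega}{\pi} \log \left( 1 + \sqrt{\mathrm{SNR}_K / 2} \right). \tag{68}$$

Proof: To prove the result it is enough to show the following inequalities for the $2\epsilon$-capacity

$$C^0_\epsilon(\mathcal{B}'_\Omega) \ge N \left[ \log \left( \zeta(N) \frac{\sqrt{E}}{\epsilon} \right) - 1 \right], \tag{69}$$

$$C^0_\epsilon(\mathcal{B}'_\Omega) \le N \log \left( 1 + \frac{\sqrt{E}}{\sqrt{2}\,\epsilon} \right) + \log \left( 1 + \frac{N}{2} \right), \tag{70}$$

because $\lim_{T \to \infty} \zeta(N) = 1$ and $\log \left( 1 + \frac{N}{2} \right) = o(T)$.

Lower bound. Let $\mathcal{M}^0_\epsilon$ be a maximal $(\epsilon, 0)$-distinguishable subset of $\mathcal{B}'_\Omega$ and $M^0_\epsilon(\mathcal{B}'_\Omega)$ be the number of elements in $\mathcal{M}^0_\epsilon$. For each point of $\mathcal{M}^0_\epsilon$ we consider a Euclidean ball whose center is the chosen point and whose radius is $2\epsilon$. Let $\mathcal{U}$ be the union of these balls. We claim that $\mathcal{B}'_\Omega$ is contained in $\mathcal{U}$. If that is not the case, we can find a point of $\mathcal{B}'_\Omega$ which is not contained in $\mathcal{M}^0_\epsilon$, but whose distance from every point in $\mathcal{M}^0_\epsilon$ exceeds $2\epsilon$, which is a contradiction. Thus, we have the chain of inequalities

$$\mathrm{vol}(\mathcal{B}'_\Omega) \le \mathrm{vol}(\mathcal{U}) \le M^0_\epsilon(\mathcal{B}'_\Omega)\, \mathrm{vol}(S_{2\epsilon}), \tag{71}$$

where $S_{2\epsilon}$ is a Euclidean ball whose radius is $2\epsilon$ and the second inequality follows from a union bound. Since $\mathrm{vol}(S_\epsilon) = \beta_N \epsilon^N$, where $\beta_N$ is the volume of $S_1$, by (71) we have

$$M^0_\epsilon(\mathcal{B}'_\Omega) \ge \frac{\mathrm{vol}(\mathcal{B}'_\Omega)}{\mathrm{vol}(S_{2\epsilon})} = \left( \frac{1}{2} \right)^{N} \frac{\mathrm{vol}(\mathcal{B}'_\Omega)}{\mathrm{vol}(S_\epsilon)}. \tag{72}$$

Since $\mathcal{B}'_\Omega$ is an ellipsoid of radii $\{\sqrt{\lambda_i E}\}_{i=1}^{N}$, we also have

$$\mathrm{vol}(\mathcal{B}'_\Omega) = \beta_N \prod_{i=1}^{N} \sqrt{\lambda_i E} = \beta_N \left( \zeta(N) \sqrt{E} \right)^{N}, \tag{73}$$

and

$$\frac{\mathrm{vol}(\mathcal{B}'_\Omega)}{\mathrm{vol}(S_\epsilon)} = \left( \zeta(N) \frac{\sqrt{E}}{\epsilon} \right)^{N}. \tag{74}$$

By combining (72) and (74), we get

$$C^0_\epsilon(\mathcal{B}'_\Omega) = \log M^0_\epsilon(\mathcal{B}'_\Omega) \ge N \left[ \log \left( \zeta(N) \frac{\sqrt{E}}{\epsilon} \right) - 1 \right]. \tag{75}$$

Upper bound. We define the auxiliary set

$$\bar{\mathcal{B}}'_\Omega = \left\{ a = (a_1, a_2, \ldots, a_N) : \sum_{n=1}^{N} a_n^2 \le E \right\}. \tag{76}$$

The corresponding space $(\bar{\mathcal{B}}'_\Omega, \|\cdot\|')$ is Euclidean. Since $\mathcal{B}'_\Omega \subset \bar{\mathcal{B}}'_\Omega$, it follows that $C^0_\epsilon(\mathcal{B}'_\Omega) \le C^0_\epsilon(\bar{\mathcal{B}}'_\Omega)$ and it is sufficient to derive an upper bound for $C^0_\epsilon(\bar{\mathcal{B}}'_\Omega)$.

Let $\mathcal{M}^0_\epsilon = \{a^{(1)}, \ldots, a^{(M)}\}$ be a maximal $(\epsilon, 0)$-distinguishable subset of $\bar{\mathcal{B}}'_\Omega$, where $M = M^0_\epsilon(\bar{\mathcal{B}}'_\Omega)$. Let $\{a^{(i_1)}, \ldots, a^{(i_m)}\}$ be any subset of $\mathcal{M}^0_\epsilon$. For any integers $j \ne k$, $j, k \in \{1, \ldots, m\}$, we have

$$\|a^{(i_j)} - a^{(i_k)}\|' \ge 2\epsilon, \tag{77}$$

and

$$\sum_{j=1}^{m} \sum_{k=1}^{m} \|a^{(i_j)} - a^{(i_k)}\|'^2 \ge 4\epsilon^2 m(m-1). \tag{78}$$

By Lemma 2 it follows that, for any $a \in \mathbb{E}^N$,

$$\sum_{j=1}^{m} \|a - a^{(i_j)}\|'^2 \ge 2\epsilon^2 (m-1). \tag{79}$$

We now define the function

$$\gamma(x) = \max \left\{ 0,\ 1 - \frac{x^2}{2\epsilon^2} \right\}, \tag{80}$$

and for any $a \in \mathbb{E}^N$, we let $\mathcal{M}_a = \{a^{(i_1)}, \ldots, a^{(i_m)}\}$ be the subset of $\mathcal{M}^0_\epsilon$ whose distance from $a$ is not larger than $\sqrt{2}\,\epsilon$. We have

$$\begin{aligned}
\sum_{j=1}^{M} \gamma(\|a - a^{(j)}\|') &= \sum_{k=1}^{m} \gamma(\|a - a^{(i_k)}\|') \\
&= \sum_{k=1}^{m} \left( 1 - \frac{\|a - a^{(i_k)}\|'^2}{2\epsilon^2} \right) \\
&= m - \frac{1}{2\epsilon^2} \sum_{k=1}^{m} \|a - a^{(i_k)}\|'^2 \\
&\le m - (m - 1) \\
&= 1,
\end{aligned} \tag{81}$$

where the last inequality follows from (79). If $a \notin S_{\sqrt{E} + \sqrt{2}\epsilon}$, then $\sum_{j=1}^{M} \gamma(\|a - a^{(j)}\|') = 0$ because $\mathcal{M}_a = \emptyset$. By using (81) and this last observation, we perform the following computation:

$$\begin{aligned}
\mathrm{vol}\left( S_{\sqrt{E} + \sqrt{2}\epsilon} \right) &= \int_{S_{\sqrt{E} + \sqrt{2}\epsilon}} da \\
&\ge \int_{S_{\sqrt{E} + \sqrt{2}\epsilon}} \sum_{j=1}^{M} \gamma(\|a - a^{(j)}\|')\, da \\
&= \sum_{j=1}^{M} \int_{\mathbb{E}^N} \gamma(\|a - a^{(j)}\|')\, da \\
&= M \int_{\mathbb{E}^N} \gamma(\|a\|')\, da \\
&= M \int_{0}^{\sqrt{2}\epsilon} \gamma(x)\, d(\beta_N x^N) \\
&= \beta_N M N \int_{0}^{\sqrt{2}\epsilon} \gamma(x)\, x^{N-1}\, dx \\
&= \beta_N M \frac{2}{N+2} \left( \sqrt{2}\,\epsilon \right)^{N},
\end{aligned} \tag{82}$$

where $\beta_N$ is the volume of $S_1$ in $\mathbb{E}^N$. Since $\mathrm{vol}\left( S_{\sqrt{E} + \sqrt{2}\epsilon} \right) = \beta_N (\sqrt{E} + \sqrt{2}\,\epsilon)^N$, we obtain

$$M^0_\epsilon(\bar{\mathcal{B}}'_\Omega) = M \le \frac{N+2}{2} \left( 1 + \frac{\sqrt{E}}{\sqrt{2}\,\epsilon} \right)^{N}. \tag{83}$$

The proof is completed by taking the logarithm. ■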
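The bounds of Theorem 1 and the closed form used in the last step of (82) lend themselves to a quick numerical check. The sketch below is a hedged illustration: it assumes $\mathrm{SNR}_K = E/\epsilon^2$, reads the per-unit-time bounds as $\frac{\Omega}{\pi}(\log\sqrt{\mathrm{SNR}_K} - 1)$ and $\frac{\Omega}{\pi}\log(1 + \sqrt{\mathrm{SNR}_K/2})$, and uses hypothetical function names.

```python
import math

def theorem1_bounds(omega, snr):
    """Lower and upper bounds on the 2*eps-capacity per unit time, in bits per second.

    Assumption: snr stands for SNR_K = E / eps^2 and omega for the bandwidth Omega.
    """
    lower = (omega / math.pi) * (math.log2(math.sqrt(snr)) - 1.0)
    upper = (omega / math.pi) * math.log2(1.0 + math.sqrt(snr / 2.0))
    return lower, upper

def gamma_integral(n, eps, steps=20000):
    """Midpoint-rule value of N * integral_0^{sqrt(2) eps} gamma(x) x^(N-1) dx."""
    r = math.sqrt(2.0) * eps
    h = r / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        # On [0, sqrt(2) eps] the function gamma(x) equals 1 - x^2 / (2 eps^2).
        total += (1.0 - x * x / (2.0 * eps * eps)) * x ** (n - 1)
    return n * total * h

def gamma_closed_form(n, eps):
    """Closed form 2 / (N + 2) * (sqrt(2) eps)^N appearing in the last line of (82)."""
    return 2.0 / (n + 2) * (math.sqrt(2.0) * eps) ** n

lower, upper = theorem1_bounds(omega=math.pi, snr=100.0)
```

For large $\mathrm{SNR}_K$ the two bounds differ by about $\frac{\Omega}{2\pi}$ bits per second, since $\log\sqrt{\mathrm{SNR}_K/2} = \log\sqrt{\mathrm{SNR}_K} - 1/2$.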
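Theorem 2. For any $0 < \delta < 1$ and $\epsilon > 0$, we have

$$\bar{C}^\delta_\epsilon(\mathcal{B}'_\Omega) \ge \frac{\Omega}{\pi} \log \sqrt{\mathrm{SNR}_K}, \tag{84}$$

$$\bar{C}^\delta_\epsilon(\mathcal{B}'_\Omega) \le \frac{\Omega}{\pi} \log \left( 1 + \sqrt{\mathrm{SNR}_K} \right). \tag{85}$$

Proof: To prove the result it is enough to show the following inequalities for the $(\epsilon, \delta)$-capacity

$$C^\delta_\epsilon(\mathcal{B}'_\Omega) \ge N \left[ \log \left( \zeta(N) \frac{\sqrt{E}}{\epsilon} \right) \right] + \log \delta, \tag{86}$$

$$C^\delta_\epsilon(\mathcal{B}'_\Omega) \le N \left[ \log \left( 1 + \frac{\sqrt{E}}{\epsilon} \right) \right] + \log \frac{1}{1 - \delta}, \tag{87}$$

because $\lim_{T \to \infty} \zeta(N) = 1$ and both $\log \delta$ and $\log \frac{1}{1-\delta}$ are $o(T)$.

Lower bound. We show that there exists a codebook $\mathcal{M} = \{a^{(1)}, a^{(2)}, \ldots, a^{(M)}\}$, where

$$M = \delta \left( \zeta(N) \frac{\sqrt{E}}{\epsilon} \right)^{N}, \tag{88}$$

that has cumulative error measure $\Delta \le \delta$. To prove this result, we consider an auxiliary stochastic communication model where the transmitter selects a signal uniformly at random from a given codebook and, given the signal $a^{(i)}$ is sent, the receiver observes $a^{(i)} + n$, with $n$ distributed uniformly in $S_\epsilon$. The receiver compares this signal with all signals in the codebook and selects the one that is nearest to it as the one actually sent. The decoding error probability of this stochastic communication model, averaged over the uniform selection of signals in the codebook, is given by

$$P_{\mathrm{err}} = \frac{1}{M} \sum_{i=1}^{M} \frac{\mathrm{vol}(D^i)}{\mathrm{vol}(S^i)}, \tag{89}$$

and by (52) and (53) it corresponds to the cumulative error measure $\Delta$ of the deterministic model that uses the same codebook. It follows that in order to prove the desired lower bound in the deterministic model, we can show that there exists a codebook in the stochastic model satisfying (88), and whose decoding error probability is at most $\delta$. This follows from a standard random coding argument, in conjunction with a less standard geometric argument due to the metric employed.

We construct a random codebook by selecting $M$ signals uniformly at random inside the ellipsoid $\mathcal{B}'_\Omega$. We indicate the average error probability over all signal selections in the codebook and over all codebooks by $\bar{P}_{\mathrm{err}}$. Since all signals in the codebook have the same error probability when averaged over all codebooks, $\bar{P}_{\mathrm{err}}$ is the same as the average error probability over all codebooks when $a^{(1)}$ is transmitted. Let in this case the received signal be $y$ and let $S^y_\epsilon$ be a Euclidean ball whose radius is $\epsilon$ and center is $y$.

The probability that the signal $y$ is decoded correctly is at least as large as the probability that the remaining $M - 1$ signals in the codebook are in $\mathcal{B}'_\Omega \setminus S^y_\epsilon$. By the union bound, we have

$$\begin{aligned}
1 - \bar{P}_{\mathrm{err}} &\ge 1 - (M - 1) \frac{\mathrm{vol}(S^y_\epsilon)}{\mathrm{vol}(\mathcal{B}'_\Omega)} \\
&\ge 1 - M \frac{\mathrm{vol}(S^y_\epsilon)}{\mathrm{vol}(\mathcal{B}'_\Omega)} \\
&= 1 - M \left( \frac{\epsilon}{\zeta(N) \sqrt{E}} \right)^{N}.
\end{aligned} \tag{90}$$

Letting $M = \delta \left( \zeta(N) \frac{\sqrt{E}}{\epsilon} \right)^{N}$, we have $\bar{P}_{\mathrm{err}} \le \delta$. This implies that there exists a given codebook for which the average probability of error over the selection of signals in the codebook given in (89) is at most $\delta$. When this same codebook is applied in the deterministic model, we also have a cumulative error measure $\Delta \le \delta$.

Upper bound. Let $\mathcal{M}^\delta_\epsilon$ be a maximal $(\epsilon, \delta)$-distinguishable subset of $\mathcal{B}'_\Omega$ and $M^\delta_\epsilon(\mathcal{B}'_\Omega) = M$ be the number of elements in $\mathcal{M}^\delta_\epsilon$. Let $\tilde{\mathcal{B}}'_\Omega$ be the union of $\mathcal{B}'_\Omega$ and the trace of the inner points of an $\epsilon$-ball whose center is moved along the boundary of $\mathcal{B}'_\Omega$, as depicted in Fig. 3.

Fig. 3. Illustration of the relationship between $\mathcal{B}'_\Omega$ and $\tilde{\mathcal{B}}'_\Omega$.

Since $\bigcup_{i=1}^{M} S^i \subset \tilde{\mathcal{B}}'_\Omega$, we have

$$\mathrm{vol} \left( \bigcup_{i=1}^{M} S^i \right) \le \mathrm{vol} \left( \tilde{\mathcal{B}}'_\Omega \right). \tag{91}$$

Since $\bigcup_{i=1}^{M} S^i = \biguplus_{i=1}^{M} (S^i \setminus D^i)$, where $\biguplus$ indicates disjoint union, we obtain

$$\begin{aligned}
\mathrm{vol} \left( \bigcup_{i=1}^{M} S^i \right) &= \mathrm{vol} \left( \biguplus_{i=1}^{M} (S^i \setminus D^i) \right) \\
&= \sum_{i=1}^{M} \left[ \mathrm{vol}(S^i) - \mathrm{vol}(D^i) \right] \\
&= \sum_{i=1}^{M} \mathrm{vol}(S^i) \left( 1 - \frac{\mathrm{vol}(D^i)}{\mathrm{vol}(S^i)} \right) \\
&= \sum_{i=1}^{M} \mathrm{vol}(S^i) (1 - \Delta_i) \\
&= M\, \mathrm{vol}(S_\epsilon) (1 - \Delta).
\end{aligned} \tag{92}$$

Since $\tilde{\mathcal{B}}'_\Omega \subset S_{\sqrt{E} + \epsilon}$, (91) can be rewritten as

$$M\, \mathrm{vol}(S_\epsilon) (1 - \Delta) \le \mathrm{vol} \left( S_{\sqrt{E} + \epsilon} \right), \tag{93}$$

or equivalently

$$M^\delta_\epsilon(\mathcal{B}'_\Omega) = M \le \frac{1}{1 - \Delta} \left( 1 + \frac{\sqrt{E}}{\epsilon} \right)^{N}. \tag{94}$$

Since $C^\delta_\epsilon(\mathcal{B}'_\Omega) = \log M^\delta_\epsilon(\mathcal{B}'_\Omega)$ and $\Delta \le \delta$, the result follows. ■

Theorem 3. For any $\epsilon > 0$, we have

$$\bar{H}_\epsilon(\mathcal{B}'_\Omega) = \frac{\Omega}{\pi} \log \sqrt{\mathrm{SNR}_K}. \tag{95}$$

Proof: To prove the result it is enough to show the following inequalities for the $\epsilon$-entropy

$$H_\epsilon(\mathcal{B}'_\Omega) \ge N \left[ \log \left( \zeta(N) \frac{\sqrt{E}}{\epsilon} \right) \right], \tag{96}$$

$$H_\epsilon(\mathcal{B}'_\Omega) \le N \left[ \log \left( \frac{\sqrt{E}}{\epsilon} \right) \right] + \eta(N), \tag{97}$$

where $\eta(N) = o(T)$ and $\lim_{T \to \infty} \zeta(N) = 1$.

Lower bound. Let $\mathcal{L}_\epsilon$ be a minimal $\epsilon$-covering subset of $\mathcal{B}'_\Omega$ and $L_\epsilon(\mathcal{B}'_\Omega)$ be the number of elements in $\mathcal{L}_\epsilon$. Since $\mathcal{L}_\epsilon$ is an $\epsilon$-covering, we have

$$\mathrm{vol}(\mathcal{B}'_\Omega) \le L_\epsilon(\mathcal{B}'_\Omega)\, \mathrm{vol}(S_\epsilon), \tag{98}$$

where $S_\epsilon$ is a Euclidean ball whose radius is $\epsilon$. By combining (74) and (98), we have

$$L_\epsilon(\mathcal{B}'_\Omega) \ge \left( \zeta(N) \frac{\sqrt{E}}{\epsilon} \right)^{N}. \tag{99}$$

The proof is completed by taking the logarithm.

Upper bound. We define the auxiliary set

$$\bar{\mathcal{B}}'_\Omega = \left\{ a = (a_1, a_2, \ldots, a_N) : \sum_{n=1}^{N} a_n^2 \le E \right\}. \tag{100}$$

The corresponding space $(\bar{\mathcal{B}}'_\Omega, \|\cdot\|')$ is Euclidean. Since $\mathcal{B}'_\Omega \subset \bar{\mathcal{B}}'_\Omega$, it follows that $H_\epsilon(\mathcal{B}'_\Omega) \le H_\epsilon(\bar{\mathcal{B}}'_\Omega)$ and it is sufficient to derive an upper bound for $H_\epsilon(\bar{\mathcal{B}}'_\Omega)$.

Let $\mathcal{L}_\epsilon$ be a minimal $\epsilon$-covering subset of $\bar{\mathcal{B}}'_\Omega$ and $L_\epsilon(\bar{\mathcal{B}}'_\Omega)$ be the number of elements in $\mathcal{L}_\epsilon$. By applying Lemma 3 we have

$$L_\epsilon(\bar{\mathcal{B}}'_\Omega) \le \frac{4e N^{3/2}}{\ln N - 2} \left( \frac{\sqrt{E}}{\epsilon} \right)^{N} \left[ N \ln N + o(N \ln N) \right], \tag{101}$$

for $N \ge 9$ and $1 < \frac{\sqrt{E}}{\epsilon} < \frac{N}{\ln N}$. By taking the logarithm, we have

$$H_\epsilon(\bar{\mathcal{B}}'_\Omega) \le N \log \left( \frac{\sqrt{E}}{\epsilon} \right) + \log \left( \frac{4e N^{3/2}}{\ln N - 2} \left[ N \ln N + o(N \ln N) \right] \right). \tag{102}$$

Letting $\eta(N)$ be equal to the second term of (102), the result follows. ■

B. Main theorems for $\mathcal{B}_\Omega$

We now extend results to the full space $\mathcal{B}_\Omega$. We define the auxiliary set

$$\underline{\mathcal{B}}_\Omega = \left\{ b = (b_1, \ldots, b_N, 0, 0, \ldots) : \sum_{n=1}^{N} b_n^2 \le E \right\}, \tag{103}$$

whose norm is the same as that of $\mathcal{B}_\Omega$. We also use another auxiliary set

$$\mathcal{B}''_\Omega = \left\{ b = (b_1, b_2, \ldots, b_{N'}) : \sum_{n=1}^{N'} b_n^2 \le E \right\}, \tag{104}$$

equipped with the norm

$$\|b\|'' = \sqrt{\sum_{n=1}^{N'} b_n^2 \lambda_n}, \tag{105}$$

where $N' = (1 + \alpha) N_0$ for an arbitrary $\alpha > 0$.

Theorem 4. For any $\epsilon > 0$, we have

$$\bar{C}^0_\epsilon(\mathcal{B}_\Omega) \ge \frac{\Omega}{\pi} \left( \log \sqrt{\mathrm{SNR}_K} - 1 \right), \tag{106}$$

$$\bar{C}^0_\epsilon(\mathcal{B}_\Omega) \le \frac{\Omega}{\pi} \log \left( 1 + \sqrt{\mathrm{SNR}_K / 2} \right). \tag{107}$$

Proof: By the continuity of the logarithmic function, to prove the upper bound it is enough to show that for any $\epsilon > \mu > 0$

$$\bar{C}^0_\epsilon(\mathcal{B}_\Omega) \le \frac{\Omega}{\pi} \log \left( 1 + \frac{\sqrt{E}}{\sqrt{2}\,(\epsilon - \mu)} \right), \tag{108}$$

and in order to prove (106) and (108), it is enough to show the following inequalities for the $2\epsilon$-capacity: for any $\epsilon > \mu > 0$

$$C^0_\epsilon(\mathcal{B}_\Omega) \ge C^0_\epsilon(\mathcal{B}'_\Omega), \tag{109}$$

$$C^0_\epsilon(\mathcal{B}_\Omega) \le C^0_{\epsilon - \mu}(\mathcal{B}'_\Omega), \tag{110}$$

and then apply Theorem 1.

Lower bound. Let $\mathcal{D}$ be a maximal $(\epsilon, 0)$-distinguishable subset of $\mathcal{B}_\Omega$ whose cardinality is $2^{C^0_\epsilon(\mathcal{B}_\Omega)}$. Similarly, let $\mathcal{E}$ be a maximal $(\epsilon, 0)$-distinguishable subset of $\underline{\mathcal{B}}_\Omega$ whose cardinality is $2^{C^0_\epsilon(\underline{\mathcal{B}}_\Omega)}$. Note that $\mathcal{E}$ is also an $(\epsilon, 0)$-distinguishable subset of $\mathcal{B}_\Omega$. Thus, we have

$$2^{C^0_\epsilon(\underline{\mathcal{B}}_\Omega)} = |\mathcal{E}| \le |\mathcal{D}| = 2^{C^0_\epsilon(\mathcal{B}_\Omega)}, \tag{111}$$

from which it follows that

$$C^0_\epsilon(\underline{\mathcal{B}}_\Omega) \le C^0_\epsilon(\mathcal{B}_\Omega). \tag{112}$$

Since $C^0_\epsilon(\underline{\mathcal{B}}_\Omega) = C^0_\epsilon(\mathcal{B}'_\Omega)$, the result follows.

Upper bound. For any $\epsilon > \mu > 0$, we consider a projection map $\beta_\mu : \mathcal{B}_\Omega \to \underline{\mathcal{B}}_\Omega$. Let $\mathcal{D}$ be a maximal $(\epsilon, 0)$-distinguishable subset of $\mathcal{B}_\Omega$ whose cardinality is $2^{C^0_\epsilon(\mathcal{B}_\Omega)}$. Similarly, let $\mathcal{E}$ be a maximal $(\epsilon - \mu, 0)$-distinguishable subset of $\underline{\mathcal{B}}_\Omega$ whose cardinality is $2^{C^0_{\epsilon - \mu}(\underline{\mathcal{B}}_\Omega)}$.

We define $\mathcal{E}' = \beta_\mu(\mathcal{D})$. In general, $\beta_\mu$ is not a one-to-one correspondence, however $|\mathcal{D}| = |\mathcal{E}'|$. If this is not the case, then there exists a pair of points $b^{(1)}, b^{(2)} \in \mathcal{D}$ satisfying $\beta_\mu(b^{(1)}) = \beta_\mu(b^{(2)}) = a$, and we have

$$\begin{aligned}
\|b^{(1)} - b^{(2)}\| &= \|b^{(1)} - a + a - b^{(2)}\| \\
&\le \|b^{(1)} - a\| + \|a - b^{(2)}\| \\
&\le \mu + \mu \\
&< 2\epsilon,
\end{aligned} \tag{113}$$

which is a contradiction. Thus, we have

$$2^{C^0_\epsilon(\mathcal{B}_\Omega)} = |\mathcal{D}| = |\mathcal{E}'|. \tag{114}$$

The distance between any pair of points in $\mathcal{E}'$ exceeds $2(\epsilon - \mu)$. If this is not the case, then there exists a pair of points in $\mathcal{E}'$ whose distance is smaller than $2(\epsilon - \mu)$. These two points can be represented by $a^{(1)} = \beta_\mu(b^{(1)})$ and $a^{(2)} = \beta_\mu(b^{(2)})$, where $b^{(1)}, b^{(2)} \in \mathcal{D}$. It follows that

$$\begin{aligned}
\|b^{(1)} - b^{(2)}\| &= \|b^{(1)} - a^{(1)} + a^{(1)} - a^{(2)} + a^{(2)} - b^{(2)}\| \\
&\le \|b^{(1)} - a^{(1)}\| + \|a^{(1)} - a^{(2)}\| + \|a^{(2)} - b^{(2)}\| \\
&< \mu + 2(\epsilon - \mu) + \mu \\
&= 2\epsilon,
\end{aligned} \tag{115}$$

which is a contradiction. Thus, $\mathcal{E}'$ is an $(\epsilon - \mu, 0)$-distinguishable subset of $\underline{\mathcal{B}}_\Omega$, and we have

$$|\mathcal{E}'| \le |\mathcal{E}| = 2^{C^0_{\epsilon - \mu}(\underline{\mathcal{B}}_\Omega)}. \tag{116}$$

By combining (114) and (116), we obtain

$$2^{C^0_\epsilon(\mathcal{B}_\Omega)} = |\mathcal{D}| = |\mathcal{E}'| \le |\mathcal{E}| = 2^{C^0_{\epsilon - \mu}(\underline{\mathcal{B}}_\Omega)}, \tag{117}$$

from which it follows that

$$C^0_\epsilon(\mathcal{B}_\Omega) \le C^0_{\epsilon - \mu}(\underline{\mathcal{B}}_\Omega). \tag{118}$$

Since $C^0_{\epsilon - \mu}(\underline{\mathcal{B}}_\Omega) = C^0_{\epsilon - \mu}(\mathcal{B}'_\Omega)$, the result follows. ■

Theorem 5. For any $0 < \delta < 1$ and $\epsilon > 0$, we have

$$\bar{C}^\delta_\epsilon(\mathcal{B}_\Omega) \ge \frac{\Omega}{\pi} \log \sqrt{\mathrm{SNR}_K}, \tag{119}$$

$$\bar{C}^\delta_\epsilon(\mathcal{B}_\Omega) \le \frac{\Omega}{\pi} \log \left( 1 + \sqrt{\mathrm{SNR}_K} \right). \tag{120}$$

Proof: In this case, while the lower bound follows from a corresponding inequality on the $(\epsilon, \delta)$-capacity, the upper bound follows from an approximation argument and holds for the $(\epsilon, \delta)$-capacity per unit time only.

Lower bound. Let $\mathcal{E}$ be a maximal $(\epsilon, \delta')$-distinguishable subset of $\mathcal{B}'_\Omega$ whose cardinality is $2^{C^{\delta'}_\epsilon(\mathcal{B}'_\Omega)}$. We define a map $\alpha : \mathcal{B}'_\Omega \to \mathcal{B}_\Omega$ such that, for $b = (b_1, \ldots, b_N) \in \mathcal{B}'_\Omega$, we have

$$\alpha(b) = (b_1, \ldots, b_N, 0, 0, \ldots) \in \mathcal{B}_\Omega. \tag{121}$$

Then $\alpha(\mathcal{E})$ is an $(\epsilon, \delta'')$-distinguishable subset of $\mathcal{B}_\Omega$ where $\delta'' \to 0$ for $\delta' \to 0$. Thus, we can choose $\delta'$ whose corresponding $\delta''$ is smaller than $\delta$. In this case, we have

$$2^{C^{\delta'}_\epsilon(\mathcal{B}'_\Omega)} = |\mathcal{E}| \le 2^{C^{\delta''}_\epsilon(\mathcal{B}_\Omega)}. \tag{122}$$

Also, since $\delta'' < \delta$, we have

$$C^{\delta''}_\epsilon(\mathcal{B}_\Omega) \le C^{\delta}_\epsilon(\mathcal{B}_\Omega). \tag{123}$$

By combining (122) and (123), we obtain

$$C^{\delta'}_\epsilon(\mathcal{B}'_\Omega) \le C^{\delta}_\epsilon(\mathcal{B}_\Omega). \tag{124}$$

The result now follows from Theorem 2.

Upper bound. We define

$$d(\mathcal{B}_\Omega, \mathcal{B}''_\Omega) = \sup_{f \in \mathcal{B}_\Omega} \inf_{g \in \mathcal{B}''_\Omega} \|f - g\|, \tag{125}$$

which is a measure of distance between $\mathcal{B}_\Omega$ and $\mathcal{B}''_\Omega$. From Property 6 of the PSWF, we have

$$d(\mathcal{B}_\Omega, \mathcal{B}''_\Omega) \to 0 \quad \text{as } N_0 \to \infty, \tag{126}$$

which implies

$$\bar{C}^\delta_\epsilon(\mathcal{B}_\Omega) = \bar{C}^\delta_\epsilon(\mathcal{B}''_\Omega). \tag{127}$$

Thus, in order to prove the upper bound of $\bar{C}^\delta_\epsilon(\mathcal{B}_\Omega)$, it is sufficient to derive an upper bound for $C^\delta_\epsilon(\mathcal{B}''_\Omega)$.

By using the same proof technique as the one in Theorem 2 we obtain

$$C^\delta_\epsilon(\mathcal{B}''_\Omega) \le N' \left[ \log \left( 1 + \frac{\sqrt{E}}{\epsilon} \right) \right] + \log \frac{1}{1 - \delta}, \tag{128}$$

which implies

$$\bar{C}^\delta_\epsilon(\mathcal{B}_\Omega) \le (1 + \alpha) \frac{\Omega}{\pi} \log \left( 1 + \sqrt{\mathrm{SNR}_K} \right). \tag{129}$$

Since $\alpha$ is an arbitrary positive number, the result follows.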
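■

Theorem 6. For any $\epsilon > 0$, we have

$$\bar{H}_\epsilon(\mathcal{B}_\Omega) = \frac{\Omega}{\pi} \log \sqrt{\mathrm{SNR}_K}. \tag{130}$$

Proof: By the continuity of the logarithmic function, to prove the result it is enough to show that for any $\epsilon > \mu > 0$

$$\bar{H}_\epsilon(\mathcal{B}_\Omega) \ge \frac{\Omega}{\pi} \log \left( \frac{\sqrt{E}}{\epsilon} \right), \tag{131}$$

$$\bar{H}_\epsilon(\mathcal{B}_\Omega) \le \frac{\Omega}{\pi} \log \left( \frac{\sqrt{E}}{\epsilon - \mu} \right), \tag{132}$$

and in order to prove (131) and (132), it is enough to show the following inequalities for the $\epsilon$-entropy: for any $\epsilon > \mu > 0$

$$H_\epsilon(\mathcal{B}_\Omega) \ge H_\epsilon(\mathcal{B}'_\Omega), \tag{133}$$

$$H_\epsilon(\mathcal{B}_\Omega) \le H_{\epsilon - \mu}(\mathcal{B}'_\Omega), \tag{134}$$

and then apply Theorem 3.

Lower bound. For any $\epsilon > \mu > 0$, we consider a projection map $\beta_\mu : \mathcal{B}_\Omega \to \underline{\mathcal{B}}_\Omega$. Let $\mathcal{D}$ be a minimal $\epsilon$-covering subset of $\mathcal{B}_\Omega$ whose cardinality is $2^{H_\epsilon(\mathcal{B}_\Omega)}$. Similarly, let $\mathcal{E}$ be a minimal $\epsilon$-covering subset of $\underline{\mathcal{B}}_\Omega$ whose cardinality is $2^{H_\epsilon(\underline{\mathcal{B}}_\Omega)}$.

We define $\mathcal{E}' = \beta_\mu(\mathcal{D})$. We claim that $\mathcal{E}'$ is also an $\epsilon$-covering subset of $\underline{\mathcal{B}}_\Omega$. Let $p$ be a point of $\underline{\mathcal{B}}_\Omega$. Since $\mathcal{D}$ is an $\epsilon$-covering subset of $\mathcal{B}_\Omega$ and $\underline{\mathcal{B}}_\Omega \subset \mathcal{B}_\Omega$, there exists a point $b \in \mathcal{D}$ such that $\|b - p\| \le \epsilon$. Note that $\|\beta_\mu(b) - p\| \le \|b - p\|$ and $\beta_\mu(b) \in \mathcal{E}'$. This means that, for any point $p \in \underline{\mathcal{B}}_\Omega$, there exists a point in $\mathcal{E}'$ whose distance from $p$ is at most $\epsilon$, which implies that $\mathcal{E}'$ is an $\epsilon$-covering subset of $\underline{\mathcal{B}}_\Omega$. Thus, we have

$$|\mathcal{E}'| \ge |\mathcal{E}| = 2^{H_\epsilon(\underline{\mathcal{B}}_\Omega)}. \tag{135}$$

Since $|\mathcal{D}| \ge |\mathcal{E}'|$, we obtain the following chain of inequalities:

$$2^{H_\epsilon(\mathcal{B}_\Omega)} = |\mathcal{D}| \ge |\mathcal{E}'| \ge |\mathcal{E}| = 2^{H_\epsilon(\underline{\mathcal{B}}_\Omega)}, \tag{136}$$

from which it follows that

$$H_\epsilon(\underline{\mathcal{B}}_\Omega) \le H_\epsilon(\mathcal{B}_\Omega). \tag{137}$$

Since $H_\epsilon(\underline{\mathcal{B}}_\Omega) = H_\epsilon(\mathcal{B}'_\Omega)$, the result follows.

Upper bound. Let $\mathcal{D}$ be a minimal $\epsilon$-covering subset of $\mathcal{B}_\Omega$ whose cardinality is $2^{H_\epsilon(\mathcal{B}_\Omega)}$. Similarly, let $\mathcal{E}$ be a minimal $(\epsilon - \mu)$-covering subset of $\underline{\mathcal{B}}_\Omega$ whose cardinality is $2^{H_{\epsilon - \mu}(\underline{\mathcal{B}}_\Omega)}$. We claim that $\mathcal{E}$ is also an $\epsilon$-covering subset of $\mathcal{B}_\Omega$. Let $p$ be a point of $\mathcal{B}_\Omega$. Since $\mathcal{E}$ is an $(\epsilon - \mu)$-covering subset of $\underline{\mathcal{B}}_\Omega$ and $\beta_\mu(p) \in \underline{\mathcal{B}}_\Omega$, there exists a point $a \in \mathcal{E}$ such that $\|a - \beta_\mu(p)\| \le \epsilon - \mu$. Then,

$$\begin{aligned}
\|a - p\| &= \|a - \beta_\mu(p) + \beta_\mu(p) - p\| \\
&\le \|a - \beta_\mu(p)\| + \|\beta_\mu(p) - p\| \\
&\le \epsilon - \mu + \mu \\
&= \epsilon.
\end{aligned} \tag{138}$$

This means that, for any point $p \in \mathcal{B}_\Omega$, there exists a point in $\mathcal{E}$ whose distance from $p$ is at most $\epsilon$, which implies that $\mathcal{E}$ is an $\epsilon$-covering subset of $\mathcal{B}_\Omega$. Thus, we have

$$2^{H_{\epsilon - \mu}(\underline{\mathcal{B}}_\Omega)} = |\mathcal{E}| \ge |\mathcal{D}| = 2^{H_\epsilon(\mathcal{B}_\Omega)}. \tag{139}$$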



Fig. 4. Lattice packing in Jagerman’s lower bound. 


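From which it follows that

$$H_{\epsilon - \mu}(\underline{\mathcal{B}}_\Omega) \ge H_\epsilon(\mathcal{B}_\Omega). \tag{140}$$

Since $H_{\epsilon - \mu}(\underline{\mathcal{B}}_\Omega) = H_{\epsilon - \mu}(\mathcal{B}'_\Omega)$, the result follows. ■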


Appendix 

A. Comparison with Jagerman’s results 

A basic relationship between e-entropy and e-capacity, given 
in 13], is 

$$C_{2\epsilon}(\mathcal{A}) \le H_\epsilon(\mathcal{A}). \tag{141}$$

It follows that a typical technique to estimate entropy and capacity is to find a lower bound for $C_{2\epsilon}$ and an upper bound for $H_\epsilon$, and if these are close to each other, then they are good estimates for both capacity and entropy.

Following this approach, Jagerman provided a lower bound on the $2\epsilon$-capacity and an upper bound on the $\epsilon$-entropy of bandlimited functions. In our notation, the lower bound [19, Theorem 6] can be written as

(142)

where the result is adapted here to real signals.

Jagerman's proof roughly follows the codebook construction corresponding to the lattice packing depicted in Figure 4. In higher dimensions the side length of the hypercube corresponding to the square in Figure 4 becomes $2\sqrt{E/N_0}$, which divided by the diameter $2\epsilon$ of the noise sphere gives the leading term $\sqrt{\mathrm{SNR}_K / N_0}$ inside the logarithm. The precise result requires a more detailed analysis of the asymptotic dimensionality of the space. This lower bound becomes very loose as $N_0 \to \infty$. In this case, by using the Taylor expansion of $\log(1 + x)$ for $x$ near zero in (142), it follows that $C_{2\epsilon}$ grows only as $\sqrt{N_0}$ and, as a consequence, we have the trivial lower bound on the $2\epsilon$-capacity per unit time

$$\bar{C}_{2\epsilon} \ge 0. \tag{143}$$

Geometrically, this is due to the volume of the high-dimensional sphere tending to concentrate on its boundary. For this reason, the packing in the inscribed hypercube in Figure 4 captures only a vanishing fraction of the volume available in the sphere. In contrast, our lower bound in Theorem 1 is non-constructive, and it gives the correct scaling order of the number of bits that can be reliably communicated over the channel, namely $N_0$ rather than $\sqrt{N_0}$, yielding a non-trivial lower bound on the $2\epsilon$-capacity per unit time.
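The difference between the two scaling orders can be illustrated numerically. The sketch below is not Jagerman's exact bound: it assumes an illustrative expression of the form $N_0 \log(1 + \sqrt{\mathrm{SNR}_K/N_0})$, built only from the leading term described above, and compares it with $N_0(\log\sqrt{\mathrm{SNR}_K} - 1)$, which is (69) with $\zeta(N) \to 1$ and $N \approx N_0$. Both function names are hypothetical.

```python
import math

def lattice_style_bits(n0, snr):
    """Assumed illustrative form N0 * log2(1 + sqrt(SNR_K / N0)); not the exact bound (142)."""
    return n0 * math.log2(1.0 + math.sqrt(snr / n0))

def theorem1_lower_bits(n0, snr):
    """N0 * (log2 sqrt(SNR_K) - 1): the lower bound (69) with zeta(N) -> 1 and N ~ N0."""
    return n0 * (math.log2(math.sqrt(snr)) - 1.0)

snr = 100.0
for n0 in (10**2, 10**4, 10**6):
    per_dim_lattice = lattice_style_bits(n0, snr) / n0
    per_dim_thm1 = theorem1_lower_bits(n0, snr) / n0
    # The first ratio decays towards zero, the second stays constant.
    print(n0, round(per_dim_lattice, 4), round(per_dim_thm1, 4))
```

Under this assumed form, quadrupling $N_0$ roughly doubles the lattice-style bit count, which is the $\sqrt{N_0}$ growth mentioned in the text, while the bits per dimension vanish.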

In the same paper, Jagerman derives an upper bound on the $\epsilon$-entropy [19, Theorem 8] by applying Mityagin's theorem 1^ , which relates entropy to the Kolmogorov $N$-width. This standard technique is also illustrated in ll^ Theorem 8]. For bandlimited signals, Jagerman further improves Mityagin's bound in a subsequent paper [20, Theorem 1], obtaining in our notation

$$H_\epsilon \le N \log \left( \frac{2\sqrt{E}}{\epsilon - \mu} + 1 \right), \tag{144}$$

where $0 < \mu < \epsilon$ and $N$ is defined in (02]). Since $\mu$ is an arbitrary positive number, (144) can be approximated by

$$H_\epsilon \le N \log \left( 2\sqrt{\mathrm{SNR}_K} + 1 \right). \tag{145}$$

The $\epsilon$-entropy per unit time is then bounded as

$$\bar{H}_\epsilon \le \frac{\Omega}{\pi} \log \left( 2\sqrt{\mathrm{SNR}_K} + 1 \right). \tag{146}$$

By combining (141), (143) and (146), Jagerman obtains

$$0 \le \bar{H}_\epsilon \le \frac{\Omega}{\pi} \log \left( 2\sqrt{\mathrm{SNR}_K} + 1 \right), \tag{147}$$

while we provide a tight characterization of the same quantity in Theorem 6 of this paper. If we use this tight result to bound the $2\epsilon$-capacity using the classic approach of (141), we obtain

$$\bar{C}_{2\epsilon} \le \frac{\Omega}{\pi} \log \sqrt{\mathrm{SNR}_K}, \tag{148}$$

while our direct bounds given in Theorem 1 yield, for high values of $\mathrm{SNR}_K$,

$$\frac{\Omega}{\pi} \left( \log \sqrt{\mathrm{SNR}_K} - 1 \right) \le \bar{C}_{2\epsilon} \le \frac{\Omega}{\pi} \left( \log \sqrt{\mathrm{SNR}_K} - \frac{1}{2} \right). \tag{149}$$


B. Relationship with Nair’s work 

Nair defined the peak maximum information rate $R_*$ in [14, Lemma 4.2] and showed that it equals the zero-error capacity [14, Theorem 4.1]. In his paper, Nair defined $R_*$ for a discrete time channel, but this definition can be modified for a continuous time channel as follows:

$$R_* = \lim_{T \to \infty} \sup_{X : X \subset \mathcal{B}_\Omega} \frac{I_*(X; Y)}{T}, \tag{150}$$

where $Y$ is the uncertain output signal yielded by $X$.

When we consider our channel model, it is clear that the supremum is achieved when $X$ is a maximal $2\epsilon$-distinguishable set, $\mathcal{M}_{2\epsilon}$. In this case, $I_*(X; Y) = \log |X| = \log M_{2\epsilon}(\mathcal{B}_\Omega)$. Thus (150) can be rewritten as follows:

$$R_* = \lim_{T \to \infty} \frac{\log M_{2\epsilon}(\mathcal{B}_\Omega)}{T}. \tag{151}$$

The right-hand side of (151) is the definition of $\bar{C}_{2\epsilon}(\mathcal{B}_\Omega)$. Thus, we conclude that $\bar{C}_{2\epsilon}(\mathcal{B}_\Omega)$ is a peak maximum information rate and equals the zero-error capacity in our setting.


C. Derivation of the error exponent 
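By (90), we have

$$\Delta = P_{\mathrm{err}} \le M \left( \frac{\epsilon}{\zeta(N) \sqrt{E}} \right)^{N}. \tag{152}$$

Let $M = 2^{TR}$, where the transmission rate $R$ is smaller than the lower bound on $\bar{C}^\delta_\epsilon$. Then, (152) can be rewritten as

$$\Delta \le 2^{-T \left[ \frac{N}{T} \log \left( \zeta(N) \frac{\sqrt{E}}{\epsilon} \right) - R \right]}. \tag{153}$$

In a stochastic setting the error exponent is defined as the logarithm of the error probability. It follows that we may also define the error exponent in our deterministic model

$$E_r(R) = \frac{N}{T} \log \left( \zeta(N) \frac{\sqrt{E}}{\epsilon} \right) - R. \tag{154}$$

Since $N/T$ tends to $\Omega/\pi$ and $\zeta(N)$ tends to 1 as $T \to \infty$, we can approximate the error exponent when $N_0$ is sufficiently large by

$$E_r(R) = \frac{\Omega}{\pi} \log \sqrt{\mathrm{SNR}_K} - R. \tag{155}$$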
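As a rough numerical illustration of the asymptotic exponent, the sketch below evaluates $\frac{\Omega}{\pi}\log\sqrt{\mathrm{SNR}_K} - R$ and the resulting bound $2^{-T E_r(R)}$ on the cumulative error measure. It assumes $\mathrm{SNR}_K = E/\epsilon^2$ and base two logarithms; the function names and the cap at one are illustrative choices.

```python
import math

def error_exponent(omega, snr, rate):
    """Asymptotic exponent (Omega / pi) * log2(sqrt(SNR_K)) - R, in bits per second."""
    return (omega / math.pi) * math.log2(math.sqrt(snr)) - rate

def error_bound(omega, snr, rate, duration):
    """Bound 2^(-T * E_r(R)) on the cumulative error measure, capped at 1."""
    return min(1.0, 2.0 ** (-duration * error_exponent(omega, snr, rate)))

# Example: Omega = pi rad/s, SNR_K = 16, R = 1 bit/s gives an exponent of 1 bit/s,
# so the bound halves with every additional second of observation.
exponent = error_exponent(math.pi, 16.0, 1.0)
bound_10s = error_bound(math.pi, 16.0, 1.0, 10.0)
```

For rates above $\frac{\Omega}{\pi}\log\sqrt{\mathrm{SNR}_K}$ the exponent is negative and the capped bound is trivially one.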



Acknowledgments. The question of determining a notion of
error exponent in a deterministic setting was raised by Francois
Baccelli, following the presentation of 171 .